1 Introduction


In experimental work e.g. in physics one often encounters problems where a standard 
statistical probability density function is applicable. It is often of great help to be able 
to handle these in different ways such as calculating probability contents or generating 
random numbers. 

For these purposes there are excellent textbooks in statistics, e.g. the classical work of
Maurice G. Kendall and Alan Stuart [1,2] or more modern textbooks such as [3] and others.
Some books are particularly aimed at experimental physics or even specifically at particle
physics [4,5,6,7,8]. Concerning numerical methods, a valuable reference worth mentioning
is [9], which has been superseded by a new edition [10]. Handbooks, especially [11], have
also been of great help throughout.

However, when it comes to actual applications it often turns out to be hard to find
detailed explanations in the literature that are ready for implementation. This work has been
collected over many years in parallel with actual experimental work. Some material may
therefore be “historical”, sometimes naive, with somewhat clumsy solutions that are not always
made in the mathematically most stringent way. We apologize for this but still hope that
it will be of interest and help to people who are struggling to find methods to solve their
statistical problems in real applications and not only learning statistics as a course.
Even if one has the skill and may be able to find solutions, it seems worthwhile to have
easy and fast access to formulae ready for application. Similar books and reports exist, e.g.
[12,13], but we hope the present work may compete in describing more distributions, being
more complete, and including more explanations of the relations given.

The material could most probably have been divided in a more logical way but we 
have chosen to present the distributions in alphabetic order. In this way it is more of a 
hand-book than a proper text-book. 

After the first release the report has been modestly changed. Minor changes to correct
misprints are made whenever they are found. In a few cases subsections and tables have been
added. These alterations are described on page 182. In October 1998 the first somewhat
bigger revision was made, where in particular a lot of material on the non-central sampling
distributions was added.

1.1 Random Number Generation 

In modern computing Monte Carlo simulations are of vital importance and we give methods
to obtain random numbers from the distributions. An earlier report dealt entirely
with these matters [14]. Not all textbooks on statistics include information on this subject,
which we find extremely useful. Large simulations are common in particle physics as well as
in other areas, but often it is also useful to make small “toy Monte Carlo programs” to
investigate and study analysis tools developed on ideal, but statistically sound, random samples.

A related and important field, which we will only mention briefly here, is how to get
good basic generators for random numbers uniformly distributed between zero
and one. Those are the basis for all the methods described in this document for obtaining
random numbers from specific distributions. For a review see e.g. [15].

Starting from older methods, often using the so-called multiplicative congruential method or
shift-generators, G. Marsaglia et al. [16] introduced in 1989 a new “universal generator” which
became the new standard in many fields. We implemented this in our experiments at
CERN and also made a package of routines for general use [17].

This method is still a very good choice but later alternatives, claimed to be even better,
have turned up. These are based on the same type of lagged Fibonacci sequences as
is used in the universal generator and were originally proposed by the same authors [18].
An implementation of this method was proposed by F. James [15] and this version was
further developed by M. Lüscher [19]. A package of routines similar to the one prepared for the
universal generator has been implemented for this method [20].

2 Probability Density Functions


2.1 Introduction 

Probability density functions in one discrete or continuous variable are denoted p(r) and
f(x), respectively. They are assumed to be properly normalized such that

\[
  \sum_r p(r) = 1 \quad\text{and}\quad \int_{-\infty}^{\infty} f(x)\,dx = 1
\]

where the sum or the integral is taken over all relevant values for which the probability
density function is defined.

Statisticians often use the distribution function or, as physicists more often call it, the
cumulative function, which is defined as

\[
  P(r) = \sum_{i=-\infty}^{r} p(i) \quad\text{and}\quad F(x) = \int_{-\infty}^{x} f(t)\,dt
\]


2.2 Moments 

Algebraic moments of order r are defined as the expectation value

\[
  \mu'_r = E(x^r) = \sum_{k=-\infty}^{\infty} k^r p(k) \quad\text{or}\quad \int_{-\infty}^{\infty} x^r f(x)\,dx
\]

Obviously $\mu'_0 = 1$ from the normalization condition and $\mu'_1$ is equal to the mean, sometimes
called the expectation value, of the distribution.

Central moments of order r are defined as

\[
  \mu_r = E((k - E(k))^r) \quad\text{or}\quad E((x - E(x))^r)
\]

of which the most commonly used is $\mu_2$, which is the variance of the distribution.

Instead of using the third and fourth central moments one often defines the coefficients
of skewness $\gamma_1$ and kurtosis¹ $\gamma_2$ by

\[
  \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \quad\text{and}\quad \gamma_2 = \frac{\mu_4}{\mu_2^2} - 3
\]

where the shift by 3 units in $\gamma_2$ assures that both measures are zero for a normal distribution.
Distributions with positive kurtosis are called leptokurtic, those with kurtosis around zero
mesokurtic and those with negative kurtosis platykurtic. Leptokurtic distributions are
normally more peaked than the normal distribution while platykurtic distributions are
more flat topped.
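As a small illustration of the definitions above, the following Python sketch computes central moments of a sample and from them the coefficients of skewness and kurtosis. The function names are our own choice for this example, and plain 1/n sample moments are assumed, without any bias correction.

```python
def central_moment(sample, r):
    """r:th central moment of the sample, (1/n) * sum((x - mean)**r)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** r for x in sample) / n


def skewness_kurtosis(sample):
    """Return (gamma_1, gamma_2) estimated from plain sample moments."""
    m2 = central_moment(sample, 2)
    m3 = central_moment(sample, 3)
    m4 = central_moment(sample, 4)
    gamma1 = m3 / m2 ** 1.5
    gamma2 = m4 / m2 ** 2 - 3.0  # shift by 3: zero for a normal distribution
    return gamma1, gamma2
```

For the symmetric, flat sample 1, 2, 3, 4, 5 this gives zero skewness and a negative (platykurtic) kurtosis.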

¹ From Greek kyrtosis = curvature, from kyrt(os) = curved, arched, round, swelling, bulging. Sometimes,
especially in older literature, $\gamma_2$ is called the coefficient of excess.

2.2.1 Errors of Moments

For a thorough presentation of how to estimate errors on moments we refer to the classical
books by M. G. Kendall and A. Stuart [1] (pp 228-245). Below only a brief description is
given. For a sample with n observations $x_1, x_2, \ldots, x_n$ we define the moment-statistics for
the algebraic and central moments $m'_r$ and $m_r$ as

\[
  m'_r = \frac{1}{n}\sum_{i=1}^{n} x_i^r \quad\text{and}\quad m_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - m'_1)^r
\]

The notations $m'_r$ and $m_r$ are thus used for the statistics (sample values) while we denote
the true, population, values by $\mu'_r$ and $\mu_r$.

The mean value of the r:th moment-statistic and the sampling covariance between the q:th and
r:th moment-statistics are given by

\[
\begin{aligned}
  E(m'_r) &= \mu'_r \\
  \mathrm{Cov}(m'_q, m'_r) &= \frac{1}{n}\left(\mu'_{q+r} - \mu'_q\mu'_r\right)
\end{aligned}
\]

These formulae are exact. Formulae for moments about the mean are not as simple since
the mean itself is subject to sampling fluctuations.

\[
\begin{aligned}
  E(m_r) &= \mu_r \\
  \mathrm{Cov}(m_q, m_r) &= \frac{1}{n}\left(\mu_{q+r} - \mu_q\mu_r + rq\,\mu_2\mu_{r-1}\mu_{q-1}
                           - r\mu_{r-1}\mu_{q+1} - q\mu_{r+1}\mu_{q-1}\right)
\end{aligned}
\]

to order $1/\sqrt{n}$ and $1/n$, respectively. The covariance between an algebraic and a central
moment is given by

\[
  \mathrm{Cov}(m_r, m'_q) = \frac{1}{n}\left(\mu_{q+r} - \mu_q\mu_r - r\mu_{q+1}\mu_{r-1}\right)
\]

to order 1/n. Note especially that

\[
\begin{aligned}
  V(m'_r) &= \frac{1}{n}\left(\mu'_{2r} - \mu'^{\,2}_r\right) \\
  V(m_r) &= \frac{1}{n}\left(\mu_{2r} - \mu_r^2 + r^2\mu_2\mu_{r-1}^2 - 2r\mu_{r-1}\mu_{r+1}\right) \\
  \mathrm{Cov}(m'_1, m_r) &= \frac{1}{n}\left(\mu_{r+1} - r\mu_2\mu_{r-1}\right)
\end{aligned}
\]
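As a worked example of the variance formula for central moment-statistics, take r = 2, i.e. the sample variance. Since the first central moment vanishes, the last two terms drop out. The value of the fourth central moment of a normal distribution, three times the squared variance, is a standard result assumed here and not derived in this section.

```latex
V(m_2) = \frac{1}{n}\left(\mu_4 - \mu_2^2 + 4\mu_2\mu_1^2 - 4\mu_1\mu_3\right)
       = \frac{1}{n}\left(\mu_4 - \mu_2^2\right)
       \qquad (\mu_1 = 0)
% normal distribution with variance \sigma^2: \mu_2 = \sigma^2, \mu_4 = 3\sigma^4
V(m_2) = \frac{1}{n}\left(3\sigma^4 - \sigma^4\right) = \frac{2\sigma^4}{n}
```

Like the general formula, this result holds to order 1/n.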


2.3 Characteristic Function 

For a distribution in a continuous variable x the Fourier transform of the probability density
function

\[
  \phi(t) = E(e^{ixt}) = \int_{-\infty}^{\infty} e^{ixt} f(x)\,dx
\]

is called the characteristic function. It has the properties that $\phi(0) = 1$ and $|\phi(t)| \le 1$
for all t. If the cumulative (distribution) function F(x) is continuous everywhere and
dF(x) = f(x)dx then we may reverse the transform such that

\[
  f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \phi(t)\,e^{-ixt}\,dt
\]


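The characteristic function is related to the moments of the distribution by

\[
  \phi_x(t) = E(e^{itx}) = \sum_{n=0}^{\infty} \frac{(it)^n E(x^n)}{n!}
            = \sum_{n=0}^{\infty} \frac{(it)^n \mu'_n}{n!}
\]

e.g. algebraic moments may be found by

\[
  \mu'_r = \frac{1}{i^r}\left(\frac{d}{dt}\right)^{r} \phi(t)\,\Big|_{t=0}
\]

To find central moments (about the mean $\mu$) use

\[
  \phi_{x-\mu}(t) = E\left(e^{it(x-\mu)}\right) = e^{-it\mu}\phi_x(t)
\]

and thus

\[
  \mu_r = \frac{1}{i^r}\left(\frac{d}{dt}\right)^{r} e^{-it\mu}\phi(t)\,\Big|_{t=0}
\]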

A very useful property of the characteristic function is that for independent variables x
and y

\[
  \phi_{x+y}(t) = \phi_x(t) \cdot \phi_y(t)
\]

As an example regard the sum $\sum a_i z_i$ where the $z_i$'s are distributed according to normal
distributions with means $\mu_i$ and variances $\sigma_i^2$. Then the linear combination will also be
distributed according to the normal distribution with mean $\sum a_i\mu_i$ and variance $\sum a_i^2\sigma_i^2$.

To show that the characteristic function in two variables factorizes is the best way to
show independence between two variables. Remember that a vanishing correlation coefficient
does not imply independence while the reverse is true.
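The statement about linear combinations of independent normal variables can be made explicit with a short derivation. It assumes the characteristic function of a normal distribution with mean μ and variance σ², exp(iμt − σ²t²/2), which is a standard result not derived in this section, and it uses that a scaled variable a z has characteristic function φ_z(at).

```latex
\phi_{\sum a_i z_i}(t) = \prod_i \phi_{z_i}(a_i t)
  = \prod_i \exp\left(i a_i \mu_i t - \tfrac{1}{2} a_i^2 \sigma_i^2 t^2\right)
  = \exp\left(i t \sum_i a_i \mu_i - \tfrac{1}{2} t^2 \sum_i a_i^2 \sigma_i^2\right)
```

The last expression is again of the normal form, with mean Σ a_i μ_i and variance Σ a_i² σ_i².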


2.4 Probability Generating Function 

In the case of a distribution in a discrete variable r the characteristic function is given by

\[
  \phi(t) = E(e^{itr}) = \sum_r p(r)\,e^{itr}
\]

In this case it is often convenient to write $z = e^{it}$ and define the probability generating
function as

\[
  G(z) = E(z^r) = \sum_r p(r)\,z^r
\]

Derivatives of G(z) evaluated at z = 1 are related to factorial moments of the distribution

\[
\begin{aligned}
  G(1) &= 1 \quad\text{(normalization)} \\
  G^1(1) &= \frac{d}{dz}G(z)\Big|_{z=1} = E(r) \\
  G^2(1) &= \frac{d^2}{dz^2}G(z)\Big|_{z=1} = E(r(r-1)) \\
  G^3(1) &= \frac{d^3}{dz^3}G(z)\Big|_{z=1} = E(r(r-1)(r-2)) \\
  G^k(1) &= \frac{d^k}{dz^k}G(z)\Big|_{z=1} = E(r(r-1)(r-2)\cdots(r-k+1))
\end{aligned}
\]


Lower order algebraic moments are then given by

\[
\begin{aligned}
  \mu'_1 &= G^1(1) \\
  \mu'_2 &= G^2(1) + G^1(1) \\
  \mu'_3 &= G^3(1) + 3G^2(1) + G^1(1) \\
  \mu'_4 &= G^4(1) + 6G^3(1) + 7G^2(1) + G^1(1)
\end{aligned}
\]

while expressions for central moments become more complicated.
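The relations between factorial moments and the lower order algebraic moments can be checked numerically. The Python sketch below is only an illustration under our own naming: it assumes the distribution is given as a finite list of probabilities p(0), p(1), ... and evaluates the factorial moments directly as expectation values instead of differentiating G(z).

```python
def factorial_moment(probs, k):
    """k:th factorial moment E(r(r-1)...(r-k+1)), i.e. the k:th derivative of G at z = 1."""
    total = 0.0
    for r, p_r in enumerate(probs):
        term = 1.0
        for j in range(k):
            term *= r - j
        total += term * p_r
    return total


def algebraic_moments_from_pgf(probs):
    """First four algebraic moments built from the factorial moments."""
    g1, g2, g3, g4 = (factorial_moment(probs, k) for k in (1, 2, 3, 4))
    return (g1,
            g2 + g1,
            g3 + 3 * g2 + g1,
            g4 + 6 * g3 + 7 * g2 + g1)
```

A direct sum of r^n p(r) over the same list should reproduce the four values returned by `algebraic_moments_from_pgf`.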

A useful property of the probability generating function is for a branching process in n
steps where

\[
  G(z) = G_1(G_2(\ldots G_{n-1}(G_n(z)) \ldots))
\]

with $G_k(z)$ the probability generating function for the distribution in the k:th step. As an
example see section 29.4.4 on page 105.


2.5 Cumulants 


Although not much used in physics the cumulants, $\kappa_r$, are of statistical interest. One
reason for this is that they have some useful properties such as being invariant under a shift
in scale (except the first cumulant which is equal to the mean and is shifted along with
the scale). Multiplying the x-scale by a constant a has the same effect as for algebraic
moments, namely to multiply $\kappa_r$ by $a^r$.

As the algebraic moment $\mu'_n$ is the coefficient of $(it)^n/n!$ in the expansion of $\phi(t)$, the
cumulant $\kappa_n$ is the coefficient of $(it)^n/n!$ in the expansion of the logarithm of $\phi(t)$
(sometimes called the cumulant generating function), i.e.

\[
  \ln\phi(t) = \sum_{n=1}^{\infty} \frac{(it)^n}{n!}\,\kappa_n
\]

and thus

\[
  \kappa_r = \frac{1}{i^r}\left(\frac{d}{dt}\right)^{r} \ln\phi(t)\,\Big|_{t=0}
\]


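Relations between cumulants and central moments for some lower orders are as follows

\[
\begin{array}{ll}
  \kappa_1 = \mu'_1 & \mu'_1 = \kappa_1 \\
  \kappa_2 = \mu_2 & \mu_2 = \kappa_2 \\
  \kappa_3 = \mu_3 & \mu_3 = \kappa_3 \\
  \kappa_4 = \mu_4 - 3\mu_2^2 & \mu_4 = \kappa_4 + 3\kappa_2^2 \\
  \kappa_5 = \mu_5 - 10\mu_3\mu_2 & \mu_5 = \kappa_5 + 10\kappa_3\kappa_2 \\
  \kappa_6 = \mu_6 - 15\mu_4\mu_2 - 10\mu_3^2 + 30\mu_2^3 &
  \mu_6 = \kappa_6 + 15\kappa_4\kappa_2 + 10\kappa_3^2 + 15\kappa_2^3 \\
  \kappa_7 = \mu_7 - 21\mu_5\mu_2 - 35\mu_4\mu_3 + 210\mu_3\mu_2^2 &
  \mu_7 = \kappa_7 + 21\kappa_5\kappa_2 + 35\kappa_4\kappa_3 + 105\kappa_3\kappa_2^2 \\
  \kappa_8 = \mu_8 - 28\mu_6\mu_2 - 56\mu_5\mu_3 - 35\mu_4^2 &
  \mu_8 = \kappa_8 + 28\kappa_6\kappa_2 + 56\kappa_5\kappa_3 + 35\kappa_4^2 \\
  \qquad + 420\mu_4\mu_2^2 + 560\mu_3^2\mu_2 - 630\mu_2^4 &
  \qquad + 210\kappa_4\kappa_2^2 + 280\kappa_3^2\kappa_2 + 105\kappa_2^4
\end{array}
\]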


2.6 Random Number Generation 

When generating random numbers from different distributions it is assumed that a good
generator for uniform pseudorandom numbers between zero and one exists (normally the
end-points are excluded).

2.6.1 Cumulative Technique 

The most direct technique to obtain random numbers from a continuous probability density
function f(x) with a limited range from $x_{min}$ to $x_{max}$ is to solve for x in the equation

\[
  \xi = \frac{F(x) - F(x_{min})}{F(x_{max}) - F(x_{min})}
\]

where $\xi$ is uniformly distributed between zero and one and F(x) is the cumulative
distribution (or as statisticians say the distribution function). For a properly normalized
probability density function thus

\[
  x = F^{-1}(\xi)
\]

The technique is sometimes also of use in the discrete case if the cumulative sum may 
be expressed in analytical form as e.g. for the geometric distribution. 

Also for general cases, discrete or continuous, e.g. from an arbitrary histogram the 
cumulative method is convenient and often faster than more elaborate methods. In this 
case the task is to construct a cumulative vector and assign a random number according to 
the value of a uniform random number (interpolating within bins in the continuous case). 
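The histogram case described above can be sketched in Python as follows. This is a minimal illustration and not a reference implementation: the function names, the bin-edge representation and the use of the standard `random` and `bisect` modules are our own assumptions.

```python
import bisect
import random


def make_cumulative(contents):
    """Normalized cumulative vector for a list of non-negative bin contents."""
    total = float(sum(contents))
    cum, running = [], 0.0
    for c in contents:
        running += c
        cum.append(running / total)
    cum[-1] = 1.0  # guard against rounding in the last entry
    return cum


def sample_histogram(edges, cum, rng=random):
    """Draw x from a histogram with bin edges `edges` and cumulative vector `cum`."""
    xi = rng.random()
    i = bisect.bisect_right(cum, xi)      # first bin with cum[i] > xi
    lo = cum[i - 1] if i > 0 else 0.0
    frac = (xi - lo) / (cum[i] - lo)      # linear interpolation within the bin
    return edges[i] + frac * (edges[i + 1] - edges[i])
```

The cumulative vector is built once and reused for every call, so each random number costs one uniform variate and one binary search.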

2.6.2 Accept-Reject technique 

A useful technique is the acceptance-rejection, or hit-miss, method where we choose $f_{max}$ to
be greater than or equal to f(x) in the entire interval between $x_{min}$ and $x_{max}$ and proceed
as follows

i Generate a pair of uniform pseudorandom numbers $\xi_1$ and $\xi_2$.
ii Determine $x = x_{min} + \xi_1 \cdot (x_{max} - x_{min})$.
iii Determine $y = f_{max} \cdot \xi_2$.
iv If $y - f(x) > 0$ reject and go to i, else accept x as a pseudorandom number from f(x).

The efficiency of this method depends on the average value of $f(x)/f_{max}$ over the
interval. If this value is close to one the method is efficient. On the other hand, if this
average is close to zero, the method is extremely inefficient. If $\alpha$ is the fraction of the area
$f_{max} \cdot (x_{max} - x_{min})$ covered by the function, the average number of rejects in step iv is
$\frac{1}{\alpha} - 1$ and $\frac{2}{\alpha}$ uniform pseudorandom numbers are required on average.
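Steps i to iv translate almost line by line into code. The following Python sketch is an illustration under our own naming. It also counts the uniform numbers consumed so that the efficiency estimate can be checked on an example.

```python
import random


def hit_or_miss(f, x_min, x_max, f_max, rng=random):
    """Sample one x from the density f on [x_min, x_max] by steps i-iv.

    f_max must satisfy f_max >= f(x) everywhere on the interval.
    Returns the accepted x and the number of uniform numbers consumed.
    """
    used = 0
    while True:
        xi1, xi2 = rng.random(), rng.random()   # step i
        used += 2
        x = x_min + xi1 * (x_max - x_min)       # step ii
        y = f_max * xi2                         # step iii
        if y - f(x) > 0:                        # step iv: reject, back to i
            continue
        return x, used
```

For the triangular density f(x) = 2x on the interval from zero to one with the bound set to 2, half of the enclosing rectangle is covered, so about four uniform numbers should be needed per accepted value.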

The efficiency of this method can be increased if we are able to choose a function h(x),
from which random numbers are more easily obtained, such that $f(x) \le a\,h(x) = g(x)$ over
the entire interval under consideration (where a is a constant). A random sample from
f(x) is obtained by

i Generate a random number x from h(x).
ii Generate a uniform random number $\xi$.
iii If $\xi \ge f(x)/g(x)$ go back to i, else accept x as a pseudorandom number from f(x).

Yet another situation is when a function g(x), from which fast generation may be
obtained, can be inscribed in such a way that a big proportion (f) of the area under the
function is covered (as an example see the trapezoidal method for the normal distribution).
Then proceed as follows:

i Generate a uniform random number $\xi$.
ii If $\xi < f$ then generate a random number from g(x).
iii Else use the acceptance-rejection technique for h(x) = f(x) - g(x) (in subintervals if
more efficient).

2.6.3 Composition Techniques 

If f(x) may be written in the form

\[
  f(x) = \int_{-\infty}^{\infty} g_z(x)\,dH(z)
\]

where we know how to sample random numbers from the p.d.f. $g_z(x)$ and the distribution
function H(z), a random number from f(x) is obtained by

i Generate two uniform random numbers $\xi_1$ and $\xi_2$.
ii Determine $z = H^{-1}(\xi_1)$.
iii Determine $x = G_z^{-1}(\xi_2)$ where $G_z$ is the distribution function corresponding to the
p.d.f. $g_z(x)$.

For more detailed information on the composition technique see [21] or [22].


A combination of the composition and the rejection method has been proposed by
J. C. Butcher [23]. It applies if f(x) can be written

\[
  f(x) = \sum_{i=1}^{n} \alpha_i f_i(x)\,g_i(x)
\]

where the $\alpha_i$ are positive constants, the $f_i(x)$ are p.d.f.'s for which we know how to sample
a random number, and the $g_i(x)$ are functions taking values between zero and one. The method
is then as follows:

i Generate uniform random numbers $\xi_1$ and $\xi_2$.
ii Determine an integer k from the discrete distribution $p_i = \alpha_i/(\alpha_1 + \alpha_2 + \cdots + \alpha_n)$
using $\xi_1$.
iii Generate a random number x from $f_k(x)$.
iv Determine $g_k(x)$ and if $\xi_2 > g_k(x)$ then go to i.
v Accept x as a random number from f(x).
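The five steps of the combined composition-rejection method can be sketched as below. The interface (lists of constants, sampling functions and weight functions) and the two-term example density are our own assumptions, chosen only to make the sketch runnable. The example uses f(x) = 4x − 3x² on the unit interval, written as 1·[uniform p.d.f.]·x + 1.5·[p.d.f. 2x]·(1 − x).

```python
import math
import random


def butcher_sample(alphas, samplers, gs, rng=random):
    """Sample x from f(x) = sum_i alphas[i] * f_i(x) * gs[i](x).

    samplers[i](rng) must return a random number from the p.d.f. f_i and
    gs[i](x) must take values between zero and one.
    """
    total = sum(alphas)
    while True:
        xi1, xi2 = rng.random(), rng.random()        # step i
        # step ii: pick k from p_i = alpha_i / sum(alpha) using xi1
        k, acc = 0, alphas[0] / total
        while xi1 >= acc and k < len(alphas) - 1:
            k += 1
            acc += alphas[k] / total
        x = samplers[k](rng)                         # step iii
        if xi2 > gs[k](x):                           # step iv: reject, back to i
            continue
        return x                                     # step v


# Hypothetical example decomposition of f(x) = 4x - 3x**2 on [0, 1].
example_alphas = [1.0, 1.5]
example_samplers = [
    lambda rng: rng.random(),              # f_1: uniform p.d.f.
    lambda rng: math.sqrt(rng.random()),   # f_2(x) = 2x by the cumulative technique
]
example_gs = [lambda x: x, lambda x: 1.0 - x]
```

The mean of this example density is 7/12, which gives a simple check on a generated sample.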

2.7 Multivariate Distributions 

Joint probability density functions in several variables are denoted by $f(x_1, x_2, \ldots, x_n)$ and
$p(r_1, r_2, \ldots, r_n)$ for continuous and discrete variables, respectively. It is assumed that they
are properly normalized, i.e. integrated (or summed) over all variables the result is unity.

2.7.1 Multivariate Moments 

The generalization of algebraic and central moments to multivariate distributions is
straightforward. As an example we take a bivariate distribution f(x, y) in two continuous variables
x and y and define algebraic and central bivariate moments of order $k, \ell$ as

\[
\begin{aligned}
  \mu'_{k\ell} &= E(x^k y^\ell) = \iint x^k y^\ell f(x,y)\,dx\,dy \\
  \mu_{k\ell} &= E((x-\mu_x)^k (y-\mu_y)^\ell) = \iint (x-\mu_x)^k (y-\mu_y)^\ell f(x,y)\,dx\,dy
\end{aligned}
\]

where $\mu_x$ and $\mu_y$ are the mean values of x and y. The covariance is a central bivariate
moment of order 1,1, i.e. $\mathrm{Cov}(x,y) = \mu_{11}$. Similarly one easily defines multivariate moments
for distributions in discrete variables.

2.7.2 Errors of Bivariate Moments 

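Algebraic ($m'_{rs}$) and central ($m_{rs}$) bivariate moments are defined by:

\[
  m'_{rs} = \frac{1}{n}\sum_{i=1}^{n} x_i^r y_i^s \quad\text{and}\quad
  m_{rs} = \frac{1}{n}\sum_{i=1}^{n} (x_i - m'_{10})^r (y_i - m'_{01})^s
\]

When there is a risk of ambiguity we write $m_{r,s}$ instead of $m_{rs}$.

The notations $m'_{rs}$ and $m_{rs}$ are used for the statistics (sample values) while we write
$\mu'_{rs}$ and $\mu_{rs}$ for the population values. The errors of bivariate moments are given by

\[
\begin{aligned}
  \mathrm{Cov}(m'_{rs}, m'_{uv}) &= \frac{1}{n}\left(\mu'_{r+u,s+v} - \mu'_{rs}\mu'_{uv}\right) \\
  \mathrm{Cov}(m_{rs}, m_{uv}) &= \frac{1}{n}\big(\mu_{r+u,s+v} - \mu_{rs}\mu_{uv}
      + ru\,\mu_{20}\mu_{r-1,s}\mu_{u-1,v} + sv\,\mu_{02}\mu_{r,s-1}\mu_{u,v-1} \\
    &\qquad + rv\,\mu_{11}\mu_{r-1,s}\mu_{u,v-1} + su\,\mu_{11}\mu_{r,s-1}\mu_{u-1,v}
      - u\,\mu_{r+1,s}\mu_{u-1,v} \\
    &\qquad - v\,\mu_{r,s+1}\mu_{u,v-1} - r\,\mu_{r-1,s}\mu_{u+1,v} - s\,\mu_{r,s-1}\mu_{u,v+1}\big)
\end{aligned}
\]

especially

\[
\begin{aligned}
  V(m'_{rs}) &= \frac{1}{n}\left(\mu'_{2r,2s} - \mu'^{\,2}_{rs}\right) \\
  V(m_{rs}) &= \frac{1}{n}\big(\mu_{2r,2s} - \mu_{rs}^2 + r^2\mu_{20}\mu_{r-1,s}^2 + s^2\mu_{02}\mu_{r,s-1}^2 \\
    &\qquad + 2rs\,\mu_{11}\mu_{r-1,s}\mu_{r,s-1} - 2r\,\mu_{r+1,s}\mu_{r-1,s} - 2s\,\mu_{r,s+1}\mu_{r,s-1}\big)
\end{aligned}
\]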

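For the covariance ($m_{11}$) we get by error propagation

\[
\begin{aligned}
  V(m_{11}) &= \frac{1}{n}\left(\mu_{22} - \mu_{11}^2\right) \\
  \mathrm{Cov}(m_{11}, m'_{10}) &= \frac{\mu_{21}}{n} \\
  \mathrm{Cov}(m_{11}, m_{20}) &= \frac{1}{n}\left(\mu_{31} - \mu_{20}\mu_{11}\right)
\end{aligned}
\]

For the correlation coefficient (denoted by $\rho = \mu_{11}/\sqrt{\mu_{20}\mu_{02}}$ for the population value and
by r for the sample value) we get

\[
  V(r) = \frac{\rho^2}{n}\left[\frac{\mu_{22}}{\mu_{11}^2}
       + \frac{1}{4}\left(\frac{\mu_{40}}{\mu_{20}^2} + \frac{\mu_{04}}{\mu_{02}^2}
       + \frac{2\mu_{22}}{\mu_{20}\mu_{02}}\right)
       - \frac{1}{\mu_{11}}\left(\frac{\mu_{31}}{\mu_{20}} + \frac{\mu_{13}}{\mu_{02}}\right)\right]
\]

Beware, however, that the sampling distribution of r tends to normality very slowly.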

2.7.3 Joint Characteristic Function 

The joint characteristic function is defined by

\[
\begin{aligned}
  \phi(t_1, t_2, \ldots, t_n) &= E\left(e^{it_1x_1 + it_2x_2 + \cdots + it_nx_n}\right) \\
    &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}
       e^{it_1x_1 + it_2x_2 + \cdots + it_nx_n} f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n
\end{aligned}
\]

From this function multivariate moments may be obtained, e.g. for a bivariate distribution
algebraic bivariate moments are given by

\[
  \mu'_{rs} = E(x_1^r x_2^s) =
  \frac{\partial^{\,r+s}\phi(t_1, t_2)}{\partial(it_1)^r\,\partial(it_2)^s}\,\Big|_{t_1=t_2=0}
\]

2.7.4 Random Number Generation


Random sampling from a many-dimensional distribution with a joint probability density
function $f(x_1, x_2, \ldots, x_n)$ can be made by the following method:

• Define the marginal distributions

\[
  g_m(x_1, x_2, \ldots, x_m) = \int f(x_1, \ldots, x_n)\,dx_{m+1}\,dx_{m+2} \cdots dx_n
                             = \int g_{m+1}(x_1, \ldots, x_{m+1})\,dx_{m+1}
\]

• Consider the conditional density function $h_m$ given by

\[
  h_m(x_m | x_1, x_2, \ldots, x_{m-1}) = g_m(x_1, x_2, \ldots, x_m) / g_{m-1}(x_1, x_2, \ldots, x_{m-1})
\]

• We see that $g_n = f$ and that

\[
  \int h_m(x_m | x_1, x_2, \ldots, x_{m-1})\,dx_m = 1
\]

from the definitions. Thus $h_m$ is the conditional distribution in $x_m$ given fixed values
for $x_1, x_2, \ldots, x_{m-1}$.

• We can now factorize f as

\[
  f(x_1, x_2, \ldots, x_n) = h_1(x_1)\,h_2(x_2|x_1) \cdots h_n(x_n|x_1, x_2, \ldots, x_{n-1})
\]

• We sample values for $x_1, x_2, \ldots, x_n$ from the joint probability density function f by:

- Generate a value for $x_1$ from $h_1(x_1)$.

- Use $x_1$ and sample $x_2$ from $h_2(x_2|x_1)$.

- Proceed step by step and use previously sampled values for $x_1, x_2, \ldots, x_m$ to
obtain a value for $x_{m+1}$ from $h_{m+1}(x_{m+1}|x_1, x_2, \ldots, x_m)$.

- Continue until all $x_i$'s have been sampled.

• If all $x_i$'s are independent the conditional densities will equal the marginal densities
and the variables can be sampled in any order.
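As a concrete sketch of this step-by-step procedure, consider the bivariate density f(x, y) = x + y on the unit square. This density is our own choice of example and does not come from the text; it is used because the marginal and conditional cumulative functions can both be inverted in closed form.

```python
import math
import random


def sample_xy(rng=random):
    """Sample (x, y) from the assumed example density f(x, y) = x + y on the unit square.

    Marginal:    h1(x)   = x + 1/2,             H1(x)   = (x**2 + x) / 2
    Conditional: h2(y|x) = (x + y) / (x + 1/2), H2(y|x) = (x*y + y**2 / 2) / (x + 1/2)
    """
    xi1, xi2 = rng.random(), rng.random()
    x = (-1.0 + math.sqrt(1.0 + 8.0 * xi1)) / 2.0          # solves H1(x) = xi1
    y = -x + math.sqrt(x * x + 2.0 * xi2 * (x + 0.5))      # solves H2(y|x) = xi2
    return x, y
```

First x is drawn from its marginal distribution, then y from the conditional distribution given that x. For this density both variables have mean 7/12 and E(xy) = 1/3.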
3 Bernoulli Distribution


3.1 Introduction 

The Bernoulli distribution, named after the Swiss mathematician Jacques Bernoulli (1654-
1705), describes a probabilistic experiment where a trial has two possible outcomes, a
success or a failure.

The parameter p is the probability for a success in a single trial, the probability for a
failure thus being 1 - p (often denoted by q). Both p and q are limited to the interval from
zero to one. The distribution has the simple form

\[
  p(r;p) = \begin{cases}
    1 - p = q & \text{if } r = 0 \text{ (failure)} \\
    p & \text{if } r = 1 \text{ (success)}
  \end{cases}
\]

and zero elsewhere. The work of J. Bernoulli, which constitutes a foundation of probability
theory, was published posthumously in Ars Conjectandi (1713) [24].

The probability generating function is G(z) = q + pz and the distribution function is given
by P(0) = q and P(1) = 1. Random numbers are easily obtained by using a uniform
random number variate $\xi$ and putting r = 1 (success) if $\xi \le p$ and r = 0 (failure) otherwise.

3.2 Relation to Other Distributions 

From the Bernoulli distribution we may deduce several probability density functions
described in this document, all of which are based on series of independent Bernoulli trials:

• Binomial distribution: expresses the probability for r successes in an experiment
with n trials ($0 \le r \le n$).

• Geometric distribution: expresses the probability of having to wait exactly r trials
before the first successful event ($r \ge 1$).

• Negative Binomial distribution: expresses the probability of having to wait
exactly r trials until k successes have occurred ($r \ge k$). This form is sometimes referred
to as the Pascal distribution.

Sometimes this distribution is expressed as the number of failures n occurring while
waiting for k successes ($n \ge 0$).

4 Beta Distribution


4.1 Introduction 


The Beta distribution is given by

\[
  f(x;p,q) = \frac{1}{B(p,q)}\,x^{p-1}(1-x)^{q-1}
\]

where the parameters p and q are positive real quantities and the variable x satisfies
$0 \le x \le 1$. The quantity B(p,q) is the Beta function defined in terms of the more common
Gamma function as

\[
  B(p,q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}
\]

For p = q = 1 the Beta distribution simply becomes a uniform distribution between
zero and one. For p = 1 and q = 2 or vice versa we get triangular shaped distributions,
f(x) = 2 - 2x and f(x) = 2x. For p = q = 2 we obtain a distribution of parabolic shape,
f(x) = 6x(1 - x). More generally, if p and q both are greater than one the distribution has
a unique mode at x = (p - 1)/(p + q - 2) and is zero at the end-points. If p and/or q is
less than one, $f(0) \to \infty$ and/or $f(1) \to \infty$ and the distribution is said to be J-shaped. In
figure 1 below we show the Beta distribution for two cases: p = q = 2 and p = 6, q = 3.



Figure 1: Examples of Beta distributions 


4.2 Derivation of the Beta Distribution 

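If $y_m$ and $y_n$ are two independent variables distributed according to the chi-squared
distribution with m and n degrees of freedom, respectively, then the ratio $y_m/(y_m + y_n)$ follows
a Beta distribution with parameters $p = \frac{m}{2}$ and $q = \frac{n}{2}$.

To show this we make a change of variables to $x = y_m/(y_m + y_n)$ and $y = y_m + y_n$, which
implies that $y_m = xy$ and $y_n = y(1 - x)$. We obtain

\[
\begin{aligned}
  f(x,y) &= \left|\begin{matrix}
      \frac{\partial y_m}{\partial x} & \frac{\partial y_m}{\partial y} \\
      \frac{\partial y_n}{\partial x} & \frac{\partial y_n}{\partial y}
    \end{matrix}\right| f(y_m, y_n)
    = \left|\begin{matrix} y & x \\ -y & 1-x \end{matrix}\right|
      \frac{\left(\frac{xy}{2}\right)^{\frac{m}{2}-1} e^{-\frac{xy}{2}}}{2\,\Gamma\left(\frac{m}{2}\right)}
      \cdot
      \frac{\left(\frac{y(1-x)}{2}\right)^{\frac{n}{2}-1} e^{-\frac{y(1-x)}{2}}}{2\,\Gamma\left(\frac{n}{2}\right)} \\
    &= \left\{\frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)}
       \,x^{\frac{m}{2}-1}(1-x)^{\frac{n}{2}-1}\right\}
       \left\{\frac{\left(\frac{y}{2}\right)^{\frac{m+n}{2}-1} e^{-\frac{y}{2}}}{2\,\Gamma\left(\frac{m+n}{2}\right)}\right\}
\end{aligned}
\]

which we recognize as a product of a Beta distribution in the variable x and a chi-squared
distribution with m + n degrees of freedom in the variable y (as expected for the sum of
two independent chi-square variables).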

which we recognize as a product of a Beta distribution in the variable x and a chi-squared 
distribution with m + n degrees of freedom in the variable y (as expected for the sum of 
two independent chi-square variables). 


4.3 Characteristic Function 

The characteristic function of the Beta distribution may be expressed in terms of the
confluent hypergeometric function (see section 43.3) as

\[
  \phi(t) = M(p, p+q; it)
\]


4.4 Moments 

The expectation value, variance, third and fourth central moment are given by

\[
\begin{aligned}
  E(x) &= \frac{p}{p+q} \\
  V(x) &= \frac{pq}{(p+q)^2(p+q+1)} \\
  \mu_3 &= \frac{2pq(q-p)}{(p+q)^3(p+q+1)(p+q+2)} \\
  \mu_4 &= \frac{3pq\left(2(p+q)^2 + pq(p+q-6)\right)}{(p+q)^4(p+q+1)(p+q+2)(p+q+3)}
\end{aligned}
\]

More generally algebraic moments are given in terms of the Beta function by

\[
  \mu'_k = \frac{B(p+k, q)}{B(p,q)}
\]


4.5 Probability Content 


In order to find the probability content for a Beta distribution we form the cumulative
distribution

\[
  F(x) = \frac{1}{B(p,q)}\int_0^x t^{p-1}(1-t)^{q-1}\,dt = \frac{B_x(p,q)}{B(p,q)} = I_x(p,q)
\]

where both $B_x$ and $I_x$ seem to be called the incomplete Beta function in the literature.

The incomplete Beta function $I_x$ is connected to the binomial distribution for integer
values of a by

\[
  1 - I_x(a,b) = I_{1-x}(b,a) = (1-x)^{a+b-1}\sum_{i=0}^{a-1}\binom{a+b-1}{i}\left(\frac{x}{1-x}\right)^{i}
\]

or expressed in the opposite direction

\[
  \sum_{s=a}^{n}\binom{n}{s} p^s (1-p)^{n-s} = I_p(a, n-a+1)
\]

Also to the negative binomial distribution there is a connection by the relation

\[
  \sum_{s=a}^{\infty}\binom{n+s-1}{s} p^n q^s = I_q(a, n)
\]

The incomplete Beta function is also connected to the probability content of Student's
t-distribution and the F-distribution. See further section 42.7 for more information on $I_x$.

4.6 Random Number Generation 

In order to obtain random numbers from a Beta distribution we first single out a few special
cases.

For p = 1 and/or q = 1 we may easily solve the equation $F(x) = \xi$ where F(x) is the
cumulative function and $\xi$ a uniform random number between zero and one. In these cases

\[
\begin{aligned}
  p = 1 &\;\Rightarrow\; x = 1 - \xi^{1/q} \\
  q = 1 &\;\Rightarrow\; x = \xi^{1/p}
\end{aligned}
\]

For p and q half-integers we may use the relation to the chi-square distribution by
forming the ratio

\[
  \frac{y_m}{y_m + y_n}
\]

with $y_m$ and $y_n$ two independent random numbers from chi-square distributions with m =
2p and n = 2q degrees of freedom, respectively.

Yet another way of obtaining random numbers from a Beta distribution, valid when p
and q are both integers, is to take the $\ell$:th out of k ($1 \le \ell \le k$) independent uniform random
numbers between zero and one (sorted in ascending order). Doing this we obtain a Beta
distribution with parameters $p = \ell$ and $q = k + 1 - \ell$. Conversely, if we want to generate
random numbers from a Beta distribution with integer parameters p and q we could use
this technique with $\ell = p$ and k = p + q - 1. This last technique implies that for low integer
values of p and q simple code may be used, e.g. for p = 2 and q = 1 we may simply take
$\max(\xi_1, \xi_2)$, i.e. the maximum of two uniform random numbers.
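The special cases above can be sketched in a few lines of Python. The function names are our own, and the chi-square variates are generated as sums of squared standard normal numbers, which is an assumption made here for self-containedness and not a method prescribed by the text.

```python
import random


def beta_order_statistic(p, q, rng=random):
    """Beta(p, q) for integer p, q: the p:th smallest of k = p + q - 1 uniforms."""
    k = p + q - 1
    uniforms = sorted(rng.random() for _ in range(k))
    return uniforms[p - 1]


def chi_square(dof, rng=random):
    """Chi-square variate as a sum of `dof` squared standard normal numbers."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dof))


def beta_chi_square_ratio(p, q, rng=random):
    """Beta(p, q) for half-integer p, q via y_m / (y_m + y_n), m = 2p, n = 2q."""
    y_m = chi_square(round(2 * p), rng)
    y_n = chi_square(round(2 * q), rng)
    return y_m / (y_m + y_n)


def beta_p_equal_one(q, rng=random):
    """Beta(1, q) by the cumulative technique: x = 1 - xi**(1/q)."""
    return 1.0 - rng.random() ** (1.0 / q)
```

A quick sanity check is to compare the sample mean with the expectation value p/(p + q) of the Beta distribution.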
5 Binomial Distribution

5.1 Introduction 

The Binomial distribution is given by

\[
  p(r;N,p) = \binom{N}{r} p^r (1-p)^{N-r}
\]

where the variable r with $0 \le r \le N$ and the parameter N (N > 0) are integers and the
parameter p ($0 \le p \le 1$) is a real quantity.

The distribution describes the probability of exactly r successes in N trials if the
probability of a success in a single trial is p (we sometimes also use q = 1 - p, the probability
for a failure, for convenience). It was first presented by Jacques Bernoulli in a work which
was posthumously published [24].


5.2 Moments 

The expectation value, variance, third and fourth central moment are given by

\[
\begin{aligned}
  E(r) &= Np \\
  V(r) &= Np(1-p) = Npq \\
  \mu_3 &= Np(1-p)(1-2p) = Npq(q-p) \\
  \mu_4 &= Np(1-p)\left[1 + 3p(1-p)(N-2)\right] = Npq\left[1 + 3pq(N-2)\right]
\end{aligned}
\]

Central moments of higher orders may be obtained by the recursive formula

\[
  \mu_{r+1} = pq\left\{Nr\mu_{r-1} + \frac{\partial\mu_r}{\partial p}\right\}
\]

starting with $\mu_0 = 1$ and $\mu_1 = 0$.

The coefficients of skewness and kurtosis are given by

\[
  \gamma_1 = \frac{q-p}{\sqrt{Npq}} \quad\text{and}\quad \gamma_2 = \frac{1-6pq}{Npq}
\]


5.3 Probability Generating Function 

The probability generating function is given by

\[
  G(z) = E(z^r) = \sum_{r=0}^{N} z^r \binom{N}{r} p^r (1-p)^{N-r} = (pz + q)^N
\]

and the characteristic function thus by

\[
  \phi(t) = G(e^{it}) = \left(q + pe^{it}\right)^N
\]

5.4 Cumulative Function

For fixed N and p one may easily construct the cumulative function P(r) by a recursive
formula, see section on random numbers below.

However, an interesting and useful relation exists between P(r) and the incomplete Beta
function $I_x$, namely

\[
  P(k) = \sum_{r=0}^{k} p(r;N,p) = I_{1-p}(N-k, k+1)
\]

For further information on $I_x$ see section 42.7.


5.5 Random Number Generation 


In order to obtain random numbers from a binomial distribution we may either

• generate N uniform random numbers and count how many of them are
less than or equal to p, or

• use the cumulative technique, i.e. construct the cumulative (distribution) function
and by use of this and one uniform random number obtain the required random
number, or

• for larger values of N, say N > 100, approximate the binomial distribution by a normal
distribution with mean Np and variance Npq.

Except for very small values of N and very high values of p the cumulative technique is the
fastest for numerical calculations. This is especially true if we proceed by constructing the
cumulative vector once and for all² (as opposed to making this at each call) using the recursive
formula

\[
  p(i) = p(i-1)\,\frac{p}{q}\,\frac{N+1-i}{i}
\]

for i = 1, 2, ..., N starting with $p(0) = q^N$.


However, using the relation given in the previous section with a well optimized code 
for the incomplete Beta function (see [10] or section 42.7) turns out to be a numerically 
more stable way of creating the cumulative distribution than a simple loop adding up the 
individual probabilities. 
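The simple loop variant of the cumulative technique can be sketched as follows. This is an illustration with names of our own choosing. It assumes 0 < p < 1 and it is the plain summation whose numerical limitations are mentioned above, not the incomplete Beta function approach.

```python
import bisect
import random


def binomial_cumulative(N, p):
    """Cumulative vector P(0), ..., P(N) from p(i) = p(i-1) * (p/q) * (N+1-i)/i."""
    q = 1.0 - p
    prob = q ** N                   # p(0) = q**N
    cum = [prob]
    for i in range(1, N + 1):
        prob *= (p / q) * (N + 1 - i) / i
        cum.append(cum[-1] + prob)
    cum[-1] = 1.0                   # guard against rounding in the last entry
    return cum


def binomial_sample(cum, rng=random):
    """Return r such that P(r-1) <= xi < P(r) for one uniform number xi."""
    return bisect.bisect_right(cum, rng.random())
```

The vector is built once for fixed N and p and then reused, so each binomial random number costs a single uniform number and a binary search.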


5.6 Estimation of Parameters 

Experimentally the quantity $\frac{r}{N}$, the relative number of successes in N trials, is often of more
interest than r itself. This variable has expectation $E(\frac{r}{N}) = p$ and variance $V(\frac{r}{N}) = \frac{pq}{N}$.
The estimated value for p in an experiment giving r successes in N trials is $\hat{p} = \frac{r}{N}$.

If p is unknown an unbiased estimate of the variance of a binomial distribution is given by

\[
  V(r) = \frac{N}{N-1}\,N\,\frac{r}{N}\left(1 - \frac{r}{N}\right) = \frac{N}{N-1}\,N\hat{p}(1-\hat{p})
\]

² This is possible only if we require random numbers from one and the same binomial distribution with
fixed values of N and p.

To find lower and upper confidence levels for p we proceed as follows.

• For lower limits find a $p_{low}$ such that

\[
  \sum_{r=k}^{N}\binom{N}{r} p_{low}^r (1-p_{low})^{N-r} = 1 - \alpha
\]

or expressed in terms of the incomplete Beta function $1 - I_{1-p}(N-k+1, k) = 1 - \alpha$

• For upper limits find a $p_{up}$ such that

\[
  \sum_{r=0}^{k}\binom{N}{r} p_{up}^r (1-p_{up})^{N-r} = 1 - \alpha
\]

which is equivalent to $I_{1-p}(N-k, k+1) = 1 - \alpha$, i.e. $I_p(k+1, N-k) = \alpha$.
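The two defining equations can be solved numerically for the limits. The Python sketch below does this by simple bisection on directly summed binomial probabilities. The function names and the choice of bisection (rather than an incomplete Beta function routine) are our own assumptions for illustration.

```python
from math import comb


def binomial_sum_low(k, N, p):
    """Sum of p(r; N, p) for r = 0, ..., k."""
    return sum(comb(N, r) * p ** r * (1.0 - p) ** (N - r) for r in range(k + 1))


def bisect_root(func, target, increasing, tol=1e-12):
    """Solve func(p) = target for p in [0, 1], func being monotonic in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def upper_limit(k, N, alpha):
    """p_up with sum_{r=0}^{k} p(r; N, p_up) = 1 - alpha (needs k < N)."""
    if not 0 <= k < N:
        raise ValueError("upper limit requires 0 <= k < N")
    return bisect_root(lambda p: binomial_sum_low(k, N, p), 1.0 - alpha, False)


def lower_limit(k, N, alpha):
    """p_low with sum_{r=k}^{N} p(r; N, p_low) = 1 - alpha (needs k > 0)."""
    if not 0 < k <= N:
        raise ValueError("lower limit requires 0 < k <= N")
    return bisect_root(lambda p: 1.0 - binomial_sum_low(k - 1, N, p), 1.0 - alpha, True)
```

Here alpha is the confidence level, e.g. 0.90. For k = 0 the upper limit has the closed form 1 - (1 - alpha)^(1/N), which is a convenient cross-check of the numerical solution.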

As an example we take an experiment with N = 10 where a certain number of successes
$0 \le k \le N$ have been observed. The confidence levels corresponding to 90%, 95%, 99%
as well as the levels corresponding to one, two and three standard deviations for a normal
distribution (84.13%, 97.72% and 99.87% probability content) are given below.

             Lower confidence levels                          Upper confidence levels
 k    -3σ    99%    -2σ    95%    90%     -σ      p̂     +σ    90%    95%    +2σ    99%    +3σ
 0                                             0.00   0.17   0.21   0.26   0.31   0.37   0.48
 1   0.00   0.00   0.00   0.01   0.01   0.02   0.10   0.29   0.34   0.39   0.45   0.50   0.61
 2   0.01   0.02   0.02   0.04   0.05   0.07   0.20   0.41   0.45   0.51   0.56   0.61   0.71
 3   0.02   0.05   0.06   0.09   0.12   0.14   0.30   0.51   0.55   0.61   0.66   0.70   0.79
 4   0.05   0.09   0.12   0.15   0.19   0.22   0.40   0.60   0.65   0.70   0.74   0.78   0.85
 5   0.10   0.15   0.18   0.22   0.27   0.30   0.50   0.70   0.73   0.78   0.82   0.85   0.90
 6   0.15   0.22   0.26   0.30   0.35   0.40   0.60   0.78   0.81   0.85   0.88   0.91   0.95
 7   0.21   0.30   0.34   0.39   0.45   0.49   0.70   0.86   0.88   0.91   0.94   0.95   0.98
 8   0.29   0.39   0.44   0.49   0.55   0.59   0.80   0.93   0.95   0.96   0.98   0.98   0.99
 9   0.39   0.50   0.55   0.61   0.66   0.71   0.90   0.98   0.99   0.99   1.00   1.00   1.00
10   0.52   0.63   0.69   0.74   0.79   0.83   1.00








5.7 Probability Content 

It is sometimes of interest to judge the significance level of a certain outcome given the
hypothesis that $p = \frac{1}{2}$. If $N$ trials are made and we find $k$ successes (let us say $k < N/2$, else
use $N - k$ instead of $k$) we want to estimate the probability to have $k$ or fewer successes
plus the probability for $N - k$ or more successes. Since the assumption is that $p = \frac{1}{2}$ we
want the two-tailed probability content.

To calculate this either sum the individual probabilities or use the relation to the
incomplete beta function. The former may seem more straightforward but the latter may be
computationally easier given a routine for the incomplete beta function. If $k = N/2$ we
must take care not to add the central term twice (in this case the requested probability is 100%
anyway). In the table below we show such confidence levels in % for values of $N$ ranging
from 1 to 20. E.g. the probability to observe 3 successes (or failures) or less and 12 failures
(or successes) or more for $N = 15$ is 3.52%.
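The direct summation of individual probabilities can be sketched as follows. This is an illustrative helper (its name is not from the text) that uses the symmetry of the binomial distribution for p = 1/2, so that both tails are equal.

```python
from math import comb


def two_tailed_percent(k, n):
    """Two-tailed probability content in % for p = 1/2: P(r <= k) + P(r >= n - k)."""
    k = min(k, n - k)                  # use the smaller of k and n - k
    if 2 * k == n:                     # the central term would be counted twice
        return 100.0
    lower_tail = sum(comb(n, r) for r in range(k + 1))
    return 100.0 * 2 * lower_tail / 2**n   # both tails are equal by symmetry


print(round(two_tailed_percent(3, 15), 2))  # 3.52
```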
  N    k=0      1       2       3       4       5       6       7       8       9      10
  1   100.00
  2    50.00  100.00
  3    25.00  100.00
  4    12.50   62.50  100.00
  5     6.25   37.50  100.00
  6     3.13   21.88   68.75  100.00
  7     1.56   12.50   45.31  100.00
  8     0.78    7.03   28.91   72.66  100.00
  9     0.39    3.91   17.97   50.78  100.00
 10     0.20    2.15   10.94   34.38   75.39  100.00
 11     0.10    1.17    6.54   22.66   54.88  100.00
 12     0.05    0.63    3.86   14.60   38.77   77.44  100.00
 13     0.02    0.34    2.25    9.23   26.68   58.11  100.00
 14     0.01    0.18    1.29    5.74   17.96   42.40   79.05  100.00
 15     0.01    0.10    0.74    3.52   11.85   30.18   60.72  100.00
 16     0.00    0.05    0.42    2.13    7.68   21.01   45.45   80.36  100.00
 17     0.00    0.03    0.23    1.27    4.90   14.35   33.23   62.91  100.00
 18     0.00    0.01    0.13    0.75    3.09    9.63   23.79   48.07   81.45  100.00
 19     0.00    0.01    0.07    0.44    1.92    6.36   16.71   35.93   64.76  100.00
 20     0.00    0.00    0.04    0.26    1.18    4.14   11.53   26.32   50.34   82.38  100.00





6 Binormal Distribution 

6.1 Introduction 

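As a generalization of the normal or Gauss distribution to two dimensions we define the
binormal distribution as

$$f(x_1,x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2 - 2\rho\,\frac{x_1-\mu_1}{\sigma_1}\cdot\frac{x_2-\mu_2}{\sigma_2}\right]\right\}$$

where $\mu_1$ and $\mu_2$ are the expectation values of $x_1$ and $x_2$, $\sigma_1$ and $\sigma_2$ their standard deviations
and $\rho$ the correlation coefficient between them. Putting $\rho = 0$ we see that the distribution
becomes the product of two one-dimensional Gauss distributions.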

Figure 2: Binormal distribution

In figure 2 we show contours for a standardized binormal distribution i.e. putting $\mu_1 = \mu_2 = 0$
and $\sigma_1 = \sigma_2 = 1$ (these parameters are anyway shift- and scale-parameters only).
In the example shown $\rho = 0.5$. Using standardized variables the contours range from a
perfect circle for $\rho = 0$ to gradually thinner ellipses in the ±45° direction as $\rho \to \pm 1$.
The contours shown correspond to the one, two, and three standard deviation levels. See
section on probability content below for details.
6.2 Conditional Probability Density

The conditional density of the binormal distribution is given by

$$f(x|y) = \frac{f(x,y)}{f(y)} = \frac{1}{\sqrt{2\pi}\,\sigma_x\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2\sigma_x^2(1-\rho^2)}\left[x - \left(\mu_x + \rho\frac{\sigma_x}{\sigma_y}(y-\mu_y)\right)\right]^2\right\} = N\left(\mu_x + \rho\frac{\sigma_x}{\sigma_y}(y-\mu_y),\; \sigma_x^2(1-\rho^2)\right)$$

which is seen to be a normal distribution which for $\rho = 0$ is, as expected, given by $N(\mu_x, \sigma_x^2)$
but generally has a mean shifted from $\mu_x$ and a variance which is smaller than $\sigma_x^2$.


6.3 Characteristic Function 

The characteristic function of the binormal distribution is given by

$$\phi(t_1,t_2) = E\left(e^{it_1x_1+it_2x_2}\right) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{it_1x_1+it_2x_2} f(x_1,x_2)\,dx_1dx_2 =$$
$$= \exp\left\{it_1\mu_1 + it_2\mu_2 + \frac{1}{2}\left[(it_1)^2\sigma_1^2 + (it_2)^2\sigma_2^2 + 2(it_1)(it_2)\rho\sigma_1\sigma_2\right]\right\}$$

which shows that if the correlation coefficient $\rho$ is zero then the characteristic function
factorizes i.e. the variables are independent. This is a unique property of the normal
distribution since in general $\rho = 0$ does not imply independence.


6.4 Moments 

To find bivariate moments of the binormal distribution the simplest, but still quite tedious, 
way is to use the characteristic function given above (see section 2.7.3). 

Algebraic bivariate moments for the binormal distribution become somewhat complicated
but normally they are of less interest than the central ones. Algebraic moments of
the type $\mu'_{0k}$ and $\mu'_{k0}$ are, of course, equal to moments of the marginal one-dimensional
normal distribution e.g. $\mu'_{10} = \mu_1$, $\mu'_{20} = \mu_1^2 + \sigma_1^2$, and $\mu'_{30} = \mu_1(3\sigma_1^2 + \mu_1^2)$ (for $\mu'_{0k}$ simply
exchange the subscripts on $\mu$ and $\sigma$). Some other lower order algebraic bivariate moments
are given by

$$\mu'_{11} = \mu_1\mu_2 + \rho\sigma_1\sigma_2$$
$$\mu'_{12} = 2\rho\sigma_1\sigma_2\mu_2 + \sigma_2^2\mu_1 + \mu_2^2\mu_1$$
$$\mu'_{22} = \sigma_1^2\sigma_2^2 + \sigma_1^2\mu_2^2 + \sigma_2^2\mu_1^2 + \mu_1^2\mu_2^2 + 2\rho^2\sigma_1^2\sigma_2^2 + 4\rho\sigma_1\sigma_2\mu_1\mu_2$$

Beware of the somewhat confusing notation where $\mu$ with two subscripts denotes bivariate
moments while $\mu$ with one subscript denotes expectation values.

Lower order central bivariate moments $\mu_{k\ell}$, arranged in matrix form, are given by

         ℓ = 0          ℓ = 1                ℓ = 2                        ℓ = 3                          ℓ = 4
k = 0    1              0                    $\sigma_2^2$                 0                              $3\sigma_2^4$
k = 1    0              $\rho\sigma_1\sigma_2$   0                        $3\rho\sigma_1\sigma_2^3$      0
k = 2    $\sigma_1^2$   0                    $\sigma_1^2\sigma_2^2(2\rho^2+1)$   0                       $3\sigma_1^2\sigma_2^4(4\rho^2+1)$
k = 3    0              $3\rho\sigma_1^3\sigma_2$   0                     $3\rho\sigma_1^3\sigma_2^3(2\rho^2+3)$   0
k = 4    $3\sigma_1^4$  0                    $3\sigma_1^4\sigma_2^2(4\rho^2+1)$  0                       $3\sigma_1^4\sigma_2^4(8\rho^4+24\rho^2+3)$


6.5 Box-Muller Transformation 


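Recall that if we have a distribution in one set of variables $\{x_1, x_2, \ldots, x_n\}$ and want to
change variables to another set $\{y_1, y_2, \ldots, y_n\}$ the distribution in the new variables is
given by

$$f(y_1, y_2, \ldots, y_n) = \left\| \begin{matrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} & \cdots & \frac{\partial x_1}{\partial y_n} \\ \frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} & \cdots & \frac{\partial x_2}{\partial y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial x_n}{\partial y_1} & \frac{\partial x_n}{\partial y_2} & \cdots & \frac{\partial x_n}{\partial y_n} \end{matrix} \right\| f(x_1, x_2, \ldots, x_n)$$

where the symbol $\|J\|$ denotes the absolute value of the determinant of the Jacobian $J$.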

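Let $x_1$ and $x_2$ be two independent stochastic variables from a uniform distribution
between zero and one and define

$$y_1 = \sqrt{-2\ln x_1}\,\sin 2\pi x_2$$
$$y_2 = \sqrt{-2\ln x_1}\,\cos 2\pi x_2$$

Note that with the definition above $-\infty < y_1 < \infty$ and $-\infty < y_2 < \infty$. In order to
obtain the joint probability density function in $y_1$ and $y_2$ we need to calculate the Jacobian
matrix

$$\frac{\partial(x_1,x_2)}{\partial(y_1,y_2)} = \begin{pmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} \\ \frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} \end{pmatrix}$$

In order to obtain these partial derivatives we express $x_1$ and $x_2$ in $y_1$ and $y_2$ by rewriting
the original equations.

$$y_1^2 + y_2^2 = -2\ln x_1$$
$$\frac{y_1}{y_2} = \tan 2\pi x_2$$

which implies

$$x_1 = e^{-\frac{1}{2}(y_1^2+y_2^2)}$$
$$x_2 = \frac{1}{2\pi}\arctan\left(\frac{y_1}{y_2}\right)$$

Then the Jacobian matrix becomes

$$\frac{\partial(x_1,x_2)}{\partial(y_1,y_2)} = \begin{pmatrix} -y_1 e^{-\frac{1}{2}(y_1^2+y_2^2)} & -y_2 e^{-\frac{1}{2}(y_1^2+y_2^2)} \\ \frac{1}{2\pi y_2}\cos^2\arctan\left(\frac{y_1}{y_2}\right) & -\frac{y_1}{2\pi y_2^2}\cos^2\arctan\left(\frac{y_1}{y_2}\right) \end{pmatrix}$$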
The distribution $f(y_1, y_2)$ is given by

$$f(y_1,y_2) = \left\|\frac{\partial(x_1,x_2)}{\partial(y_1,y_2)}\right\| f(x_1,x_2)$$

where $f(x_1,x_2)$ is the uniform distribution in $x_1$ and $x_2$. Now $f(x_1,x_2) = 1$ in the interval
$0 \le x_1 \le 1$ and $0 \le x_2 \le 1$ and zero outside this region, and the absolute value of the
determinant of the Jacobian is

$$\left\|\frac{\partial(x_1,x_2)}{\partial(y_1,y_2)}\right\| = \frac{1}{2\pi} e^{-\frac{1}{2}(y_1^2+y_2^2)} \left(\frac{y_1^2}{y_2^2}+1\right)\cos^2\arctan\left(\frac{y_1}{y_2}\right)$$

but

$$\left(\frac{y_1^2}{y_2^2}+1\right)\cos^2\arctan\left(\frac{y_1}{y_2}\right) = \left(\tan^2 2\pi x_2 + 1\right)\cos^2 2\pi x_2 = 1$$

and thus

$$f(y_1,y_2) = \frac{1}{2\pi} e^{-\frac{1}{2}(y_1^2+y_2^2)} = \frac{1}{\sqrt{2\pi}} e^{-\frac{y_1^2}{2}} \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac{y_2^2}{2}}$$

i.e. the product of two standard normal distributions.

Thus the result is that $y_1$ and $y_2$ are distributed as two independent standard normal
variables. This is a well known method, often called the Box-Muller transformation, used
to obtain pseudorandom numbers from the standard normal distribution given
a uniform pseudorandom number generator (see below). The method was introduced by
G. E. P. Box and M. E. Muller [25].
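As a minimal sketch of the transformation derived in this section, the helper below maps two uniform numbers to two standard normal numbers. The function name and the fixed seed are illustrative choices, and the shift of the first uniform number to the interval (0, 1] is an assumption made here to keep the logarithm finite.

```python
import math
import random


def box_muller(x1, x2):
    """Map two uniform numbers (0 < x1 <= 1, 0 <= x2 < 1) to two standard normal numbers."""
    radius = math.sqrt(-2.0 * math.log(x1))
    angle = 2.0 * math.pi * x2
    return radius * math.sin(angle), radius * math.cos(angle)


rng = random.Random(1)            # fixed seed, an arbitrary choice for reproducibility
x1 = 1.0 - rng.random()           # shift [0, 1) to (0, 1] so that log(x1) is finite
y1, y2 = box_muller(x1, rng.random())
print(y1, y2)
```

By construction $y_1^2 + y_2^2 = -2\ln x_1$, which gives a quick consistency check on any implementation.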


6.6 Probability Content 

In figure 2 contours corresponding to one, two, and three standard deviations were shown.
The projection on each axis for e.g. the one standard deviation contour covers the range
$-1 \le x_i \le 1$ and contains a probability content of 68.3% which is well known from the
one-dimensional case.

More generally, for a contour corresponding to $z$ standard deviations the contour has
the equation

$$\frac{(x_1+x_2)^2}{1+\rho} + \frac{(x_1-x_2)^2}{1-\rho} = 2z^2$$

i.e. the major and minor semi-axes are $z\sqrt{1+\rho}$ and $z\sqrt{1-\rho}$, respectively. The function
value at the contour is given by

$$f(x_1,x_2) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{-\frac{z^2}{2}\right\}$$

Expressed in polar coordinates $(r, \phi)$ the contour is described by

$$r^2 = \frac{z^2(1-\rho^2)}{1-2\rho\sin\phi\cos\phi}$$

While the projected probability contents follow the usual figures for one-dimensional
normal distributions the joint probability content within each ellipse is smaller. For the
one, two, and three standard deviation contours the probability content, regardless of the
correlation coefficient $\rho$, inside the ellipse is approximately 39.3%, 86.5%, and 98.9%. If
we would like to find the ellipse with a joint probability content of 68.3% we must choose
$z \approx 1.5$ (for a content of 95.5% use $z \approx 2.5$ and for 99.7% use $z \approx 3.4$). See further discussion
on probability content for a multinormal distribution in section 28.3.
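The quoted figures can be reproduced from the closed form $1 - e^{-z^2/2}$ for the joint content inside the $z$ standard deviation ellipse. This expression is not written out in the text; it is a standard result, since the quadratic form in the exponent is chi-square distributed with two degrees of freedom. A short sketch with illustrative function names:

```python
import math


def content_inside(z):
    """Joint probability content inside the z standard deviation ellipse."""
    return 1.0 - math.exp(-0.5 * z * z)


def z_for_content(prob):
    """Inverse relation: the z whose ellipse holds the joint probability content prob."""
    return math.sqrt(-2.0 * math.log(1.0 - prob))


for z in (1, 2, 3):
    print(z, round(100 * content_inside(z), 1))   # 39.3, 86.5, 98.9
print(round(z_for_content(0.683), 1))             # 1.5
```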


6.7 Random Number Generation 


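The joint distribution of $y_1$ and $y_2$ in section 6.5 above is a binormal distribution having
$\rho = 0$. For arbitrary correlation coefficients $\rho$ the binormal distribution is given by

$$f(x_1,x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \cdot e^{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2 - 2\rho\,\frac{x_1-\mu_1}{\sigma_1}\cdot\frac{x_2-\mu_2}{\sigma_2}\right]}$$

where $\mu_1$ and $\mu_2$ are the expectation values of $x_1$ and $x_2$, $\sigma_1$ and $\sigma_2$ their standard deviations
and $\rho$ the correlation coefficient between them.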

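Variables distributed according to the binormal distribution may be obtained by transforming
the two independent numbers $y_1$ and $y_2$ found in section 6.5 either as

$$z_1 = \mu_1 + \sigma_1\left(y_1\sqrt{1-\rho^2} + y_2\rho\right)$$
$$z_2 = \mu_2 + \sigma_2 y_2$$

or as

$$z_1 = \mu_1 + \frac{\sigma_1}{\sqrt{2}}\left(y_1\sqrt{1+\rho} + y_2\sqrt{1-\rho}\right)$$
$$z_2 = \mu_2 + \frac{\sigma_2}{\sqrt{2}}\left(y_1\sqrt{1+\rho} - y_2\sqrt{1-\rho}\right)$$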

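which can be proved by expressing $y_1$ and $y_2$ as functions of $z_1$ and $z_2$ and evaluating

$$f(z_1,z_2) = \left\|\frac{\partial(y_1,y_2)}{\partial(z_1,z_2)}\right\| f(y_1,y_2) = \left\| \begin{matrix} \frac{\partial y_1}{\partial z_1} & \frac{\partial y_1}{\partial z_2} \\ \frac{\partial y_2}{\partial z_1} & \frac{\partial y_2}{\partial z_2} \end{matrix} \right\| f(y_1,y_2)$$

In the first case

$$y_1 = \frac{1}{\sqrt{1-\rho^2}}\left(\frac{z_1-\mu_1}{\sigma_1} - \rho\,\frac{z_2-\mu_2}{\sigma_2}\right)$$
$$y_2 = \frac{z_2-\mu_2}{\sigma_2}$$

and in the second case

$$y_1 = \frac{\sqrt{2}}{2\sqrt{1+\rho}}\left(\frac{z_1-\mu_1}{\sigma_1} + \frac{z_2-\mu_2}{\sigma_2}\right)$$
$$y_2 = \frac{\sqrt{2}}{2\sqrt{1-\rho}}\left(\frac{z_1-\mu_1}{\sigma_1} - \frac{z_2-\mu_2}{\sigma_2}\right)$$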
In both cases the absolute value of the determinant of the Jacobian is $1/\sigma_1\sigma_2\sqrt{1-\rho^2}$ and
we get

$$f(z_1,z_2) = \frac{1}{\sigma_1\sigma_2\sqrt{1-\rho^2}} \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac{y_1^2}{2}} \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac{y_2^2}{2}} = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \cdot e^{-\frac{1}{2}(y_1^2+y_2^2)}$$

Inserting the relations expressing $y_1$ and $y_2$ in $z_1$ and $z_2$ in the exponent we finally obtain
the binormal distribution in both cases.

Thus we have found methods which, given two independent uniform pseudorandom numbers
between zero and one, supply us with a pair of numbers from a binormal distribution
with arbitrary means, standard deviations and correlation coefficient.
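Both transformations of this section can be sketched as small helpers that take two independent standard normal numbers, however obtained. The function and parameter names are illustrative only, and the use of `random.gauss` as the source of standard normal numbers in the usage lines is an assumption (the Box-Muller transformation of section 6.5 would serve equally well).

```python
import math
import random


def binormal_first(y1, y2, mu1, mu2, sigma1, sigma2, rho):
    """First transformation: z2 depends on y2 only."""
    z1 = mu1 + sigma1 * (y1 * math.sqrt(1.0 - rho * rho) + y2 * rho)
    z2 = mu2 + sigma2 * y2
    return z1, z2


def binormal_second(y1, y2, mu1, mu2, sigma1, sigma2, rho):
    """Second, symmetric transformation."""
    a = y1 * math.sqrt(1.0 + rho)
    b = y2 * math.sqrt(1.0 - rho)
    z1 = mu1 + sigma1 / math.sqrt(2.0) * (a + b)
    z2 = mu2 + sigma2 / math.sqrt(2.0) * (a - b)
    return z1, z2


rng = random.Random(7)                      # fixed seed, arbitrary
y1, y2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
print(binormal_first(y1, y2, 1.0, 2.0, 2.0, 3.0, 0.5))
print(binormal_second(y1, y2, 1.0, 2.0, 2.0, 3.0, 0.5))
```

The two functions give different pairs for the same input but the same joint distribution.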
7 Cauchy Distribution

7.1 Introduction

The Cauchy distribution is given by

$$f(x) = \frac{1}{\pi} \cdot \frac{1}{1+x^2}$$

and is defined for $-\infty < x < \infty$. It is a symmetric unimodal distribution as is shown in
figure 3.


Figure 3: Graph of the Cauchy distribution

The distribution is named after the famous French mathematician Augustin Louis Cauchy
(1789-1857) who was a professor at École Polytechnique in Paris from 1816. He was one of
the most productive mathematicians who ever lived.


7.2 Moments 

This probability density function is peculiar inasmuch as it has undefined expectation value
and all higher moments diverge. For the expectation value the integral

$$E(x) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{x}{1+x^2}\,dx$$

is not completely convergent, i.e.

$$\lim_{a\to\infty,\,b\to\infty} \frac{1}{\pi}\int_{-a}^{b} \frac{x}{1+x^2}\,dx$$

does not exist. However, the principal value

$$\lim_{a\to\infty} \frac{1}{\pi}\int_{-a}^{a} \frac{x}{1+x^2}\,dx$$

does exist and is equal to zero. Anyway the convention is to regard the expectation value
of the Cauchy distribution as undefined.

Other measures of location and dispersion which are useful in the case of the Cauchy
distribution are the median and the mode, which are at $x = 0$, and the half-width at
half-maximum which is 1 (half-maxima at $x = \pm 1$).


7.3 Normalization 

In spite of the somewhat awkward property of not having any moments the distribution at
least fulfils the normalization requirement for a proper probability density function i.e.

$$N = \int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{1}{1+x^2}\,dx = \frac{1}{\pi}\int_{-\pi/2}^{\pi/2} \frac{1}{1+\tan^2\phi} \cdot \frac{d\phi}{\cos^2\phi} = 1$$

where we have made the substitution $\tan\phi = x$ in order to simplify the integration.


7.4 Characteristic Function 

The characteristic function for the Cauchy distribution is given by

$$\phi(t) = \int_{-\infty}^{\infty} e^{itx} f(x)\,dx = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\cos tx + i\sin tx}{1+x^2}\,dx =$$
$$= \frac{1}{\pi}\left(\int_{0}^{\infty} \frac{\cos tx}{1+x^2}\,dx + \int_{-\infty}^{0} \frac{\cos tx}{1+x^2}\,dx + \int_{0}^{\infty} \frac{i\sin tx}{1+x^2}\,dx + \int_{-\infty}^{0} \frac{i\sin tx}{1+x^2}\,dx\right) =$$
$$= \frac{2}{\pi}\int_{0}^{\infty} \frac{\cos tx}{1+x^2}\,dx = e^{-|t|}$$

where we have used that the two sine integrals are equal but with opposite sign whereas
the two cosine integrals are equal. The final integral we have taken from standard integral
tables. Note that the characteristic function has no derivatives at $t = 0$ once again telling
us that the distribution has no moments.


7.5 Location and Scale Parameters 

In the form given above the Cauchy distribution has no parameters. It is useful, however,
to introduce location ($x_0$) and scale ($\Gamma > 0$) parameters writing

$$f(x; x_0, \Gamma) = \frac{1}{\pi} \cdot \frac{\Gamma}{\Gamma^2 + (x-x_0)^2}$$

where $x_0$ is the mode of the distribution and $\Gamma$ the half-width at half-maximum (HWHM).
Including these two parameters the characteristic function is modified to

$$\phi(t) = e^{itx_0 - \Gamma|t|}$$

7.6 Breit-Wigner Distribution 

In this last form we recognize the Breit-Wigner formula, named after the two physicists
Gregory Breit and Eugene Wigner, which arises in physics e.g. in the description of the
cross section dependence on energy (mass) for two-body resonance scattering. Resonances
like e.g. the $\Delta^{++}$ in $\pi^+p$ scattering or the $\rho$ in $\pi\pi$ scattering can be quite well described
in terms of the Cauchy distribution. This is the reason why the Cauchy distribution in
physics often is referred to as the Breit-Wigner distribution. However, in more elaborate
physics calculations the width may be energy-dependent in which case things become more
complicated.

7.7 Comparison to Other Distributions 

The Cauchy distribution is often compared to the normal (or Gaussian) distribution with
mean $\mu$ and standard deviation $\sigma > 0$

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

and the double-exponential distribution with mean $\mu$ and slope parameter $\lambda > 0$

$$f(x; \mu, \lambda) = \frac{\lambda}{2}\, e^{-\lambda|x-\mu|}$$

These are also examples of symmetric unimodal distributions. The Cauchy distribution has
longer tails than the double-exponential distribution which in turn has longer tails than
the normal distribution. In figure 4 we compare the Cauchy distribution with the standard
normal ($\mu = 0$ and $\sigma = 1$) and the double-exponential distributions ($\lambda = 1$) for $x > 0$.

The normal and double-exponential distributions have well defined moments. Since
they are symmetric all central moments of odd order vanish while central moments of even
order are given by $\mu_{2n} = (2n)!\,\sigma^{2n}/2^n n!$ (for $n \ge 0$) for the normal and by $\mu_n = n!/\lambda^n$ (for
even $n$) for the double-exponential distribution. E.g. the variances are $\sigma^2$ and $2/\lambda^2$ and the
fourth central moments $3\sigma^4$ and $24/\lambda^4$, respectively.

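The Cauchy distribution is related to Student's t-distribution with $n$ degrees of freedom
(with $n$ a positive integer)

$$f(t; n) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1+\frac{t^2}{n}\right)^{-\frac{n+1}{2}} = \frac{\left(1+\frac{t^2}{n}\right)^{-\frac{n+1}{2}}}{\sqrt{n}\,B\left(\frac{1}{2},\frac{n}{2}\right)}$$

where $\Gamma(x)$ is the Euler gamma-function, not to be mixed up with the width parameter for
the Cauchy distribution used elsewhere in this section. $B$ is the beta-function defined in
terms of the $\Gamma$-function as $B(p,q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}$. As can be seen the Cauchy distribution arises
as the special case where $n = 1$. If we change variable to $x = t/\sqrt{n}$ and put $m = \frac{n+1}{2}$ the
Student's t-distribution becomes

$$f(x; m) = \frac{k}{(1+x^2)^m} \quad\text{with}\quad k = \frac{\Gamma(m)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(m-\frac{1}{2}\right)} = \frac{1}{B\left(m-\frac{1}{2},\frac{1}{2}\right)}$$

where $k$ is simply a normalization constant. Here it is easier to see the more general form
of this distribution which for $m = 1$ gives the Cauchy distribution. The requirement $n \ge 1$
corresponds to $m$ being a half-integer $\ge 1$ but we could even allow for $m$ being a real
number.

Figure 4: Comparison between the Cauchy distribution, the standard normal distribution,
and the double-exponential distribution

As for the Cauchy distribution the Student's t-distribution has problems with divergent
moments and moments of order $\ge n$ do not exist. Below this limit odd central moments
are zero (the distribution is symmetric) and even central moments are given by

$$\mu_{2r} = n^r\, \frac{\Gamma\left(r+\frac{1}{2}\right)\Gamma\left(\frac{n}{2}-r\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{n}{2}\right)}$$

for $r$ a positive integer ($2r < n$). More specifically the expectation value is $E(t) = 0$, the
variance $V(t) = \frac{n}{n-2}$ and the fourth central moment is given by $\mu_4 = \frac{3n^2}{(n-2)(n-4)}$ when they
exist. As $n \to \infty$ the Student's t-distribution approaches a standard normal distribution.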


7.8 Truncation 

In order to avoid the long tails of the distribution one sometimes introduces a truncation.
This, of course, also cures the problem with the undefined mean and divergent higher
moments. For a symmetric truncation $-X \le x \le X$ we obtain the renormalized probability
density function

$$f(x) = \frac{1}{2\arctan X} \cdot \frac{1}{1+x^2}$$

which has expectation value $E(x) = 0$, variance $V(x) = \frac{X}{\arctan X} - 1$, third central moment
$\mu_3 = 0$ and fourth central moment $\mu_4 = \frac{X}{\arctan X}\left(\frac{X^2}{3} - 1\right) + 1$. The fraction of the original
Cauchy distribution within the symmetric interval is $f = \frac{2}{\pi}\arctan X$. We will, however,
not make any truncation of the Cauchy distribution in the considerations made in this
note.


7.9 Sum and Average of Cauchy Variables 

In most cases one would expect the sum and average of many variables drawn from the
same population to approach a normal distribution. This follows from the famous Central
Limit Theorem. However, due to the divergent variance of the Cauchy distribution the
requirements for this theorem to hold are not fulfilled and thus this is not the case here. We
define

$$S_n = \sum_{i=1}^{n} x_i \quad\text{and}\quad \bar{S}_n = \frac{1}{n} S_n$$

with $x_i$ independent variables from a Cauchy distribution.

The characteristic function of a sum of independent random variables is equal to the
product of the individual characteristic functions and hence

$$\Phi(t) = \phi(t)^n = e^{-n|t|}$$

for $S_n$. Turning this into a probability density function we get (putting $x = S_n$ for
convenience)

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \Phi(t) e^{-ixt}\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-(ixt+n|t|)}\,dt = \frac{1}{2\pi}\left(\int_{-\infty}^{0} e^{nt-ixt}\,dt + \int_{0}^{\infty} e^{-ixt-nt}\,dt\right) =$$
$$= \frac{1}{2\pi}\left(\left[\frac{e^{t(n-ix)}}{n-ix}\right]_{-\infty}^{0} + \left[\frac{e^{-t(ix+n)}}{-n-ix}\right]_{0}^{\infty}\right) = \frac{1}{2\pi}\left(\frac{1}{n-ix} + \frac{1}{n+ix}\right) = \frac{1}{\pi} \cdot \frac{n}{n^2+x^2}$$

This we recognize as a Cauchy distribution with scale parameter $\Gamma = n$ and thus for each
additional Cauchy variable the HWHM increases by one unit.

Moreover, the probability density function of $\bar{S}_n$ is given by

$$f(\bar{S}_n) = \left|\frac{dS_n}{d\bar{S}_n}\right| f(S_n) = \frac{1}{\pi} \cdot \frac{1}{1+\bar{S}_n^2}$$

i.e. the somewhat amazing result is that the average of any number of independent random
variables from a Cauchy distribution is also distributed according to the Cauchy distribution.
7.10 Estimation of the Median

For the Cauchy distribution the sample mean is not a consistent estimator of the median
of the distribution. In fact, as we saw in the previous section, the sample mean is itself
distributed according to the Cauchy distribution and therefore has divergent variance.
However, the sample median for a sample of $n$ independent observations from a Cauchy
distribution is a consistent estimator of the true median.

In the table below we give the expectations and variances of the sample mean and
sample median estimators for the normal, double-exponential and Cauchy distributions
(see above for definitions of distributions). Sorting all the observations the median is taken
as the value for the central observation for odd $n$ and as the average of the two central
values for even $n$. The variance of the sample mean is simply the variance of the distribution
divided by the sample size $n$. For large $n$ the variance of the sample median $m$ is given by
$V(m) = 1/4nf^2$ where $f$ is the function value at the median.


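Distribution          $E(\bar{x})$   $V(\bar{x})$               $E(m)$    $V(m)$
Normal                $\mu$          $\frac{\sigma^2}{n}$       $\mu$     $\frac{\pi\sigma^2}{2n}$
Double-exponential    $\mu$          $\frac{2}{n\lambda^2}$     $\mu$     $\frac{1}{n\lambda^2}$
Cauchy                undef.         $\infty$                   $x_0$     $\frac{\pi^2\Gamma^2}{4n}$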

For a normal distribution the sample mean is superior to the median as an estimator of
the mean (i.e. it has the smaller variance). However, the double-exponential distribution
is an example of a distribution where the sample median is the best estimator of the mean
of the distribution. In the case of the Cauchy distribution only the median works of the
above alternatives but even better is a proper Maximum Likelihood estimator. In the case
of the normal and double-exponential the mean and median, respectively, are identical to
the maximum likelihood estimators but for the Cauchy distribution such an estimator may
not be expressed in a simple way.

The large $n$ approximation for the variance of the sample median gives conservative
estimates for lower values of $n$ in the case of the normal distribution. Beware, however,
that for the Cauchy and the double-exponential distributions it is not conservative but
gives too small values. Calculating the standard deviation this is within 10% of the true
value already at $n = 5$ for the normal distribution whereas for the Cauchy distribution this
is true at about $n = 20$ and for the double-exponential distribution only at about $n = 60$.

7.11 Estimation of the HWHM 

To find an estimator for the half-width at half-maximum is not trivial. It implies binning 
the data, finding the maximum and then locating the positions where the curve is at half- 
maximum. Often it is preferable to fit the probability density function to the observations 
in such a case. 
However, it turns out that another measure of dispersion, the so called semi-interquartile
range, can be used as an estimator. The semi-interquartile range is defined as half the
difference between the upper and the lower quartiles. The quartiles are the values which
divide the probability density function into four parts with equal probability content, i.e.
25% each. The second quartile is thus identical to the median. The definition is thus

$$S = \frac{1}{2}(Q_3 - Q_1) = \Gamma$$

where $Q_1$ is the lower and $Q_3$ the upper quartile which for the Cauchy distribution are equal
to $x_0 - \Gamma$ and $x_0 + \Gamma$, respectively. As is seen $S = \mathrm{HWHM} = \Gamma$ and thus this estimator
may be used in order to estimate $\Gamma$.
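A sample version of the semi-interquartile range can be sketched as below. The quartile position rule used here (position $pn + \frac{1}{2}$ counted from one, with linear interpolation between neighbouring observations) is an assumption of this sketch; it is chosen so that for $n = 2, 6, 10, \ldots$ the quartiles fall exactly on observations. The function name is illustrative.

```python
def semi_interquartile_range(sample):
    """Half the distance between the upper and lower sample quartiles."""
    data = sorted(sample)
    n = len(data)

    def quantile(p):
        # zero-based position p*n - 1/2, clamped to the range of the sample
        pos = min(max(p * n - 0.5, 0.0), n - 1.0)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return data[lo] + (pos - lo) * (data[hi] - data[lo])

    return 0.5 * (quantile(0.75) - quantile(0.25))


# n = 6: lower quartile at the 2nd and upper quartile at the 5th observation
print(semi_interquartile_range([1, 2, 3, 4, 5, 6]))  # 1.5
```

Other interpolation conventions exist and give slightly different values for small samples.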

We gave above the large $n$ approximation for the variance of the median. The median
and the quartiles are examples of the more general concept quantiles. Generally the large
$n$ approximation for the variance of a quantile $Q$ is given by $V(Q) = pq/nf^2$ where $f$ is the
ordinate at the quantile and $p$ and $q = 1 - p$ are the probability contents above and below
the quantile, respectively. The covariance between two quantiles $Q_1$ and $Q_2$ is, with similar
notations, given by $\mathrm{Cov}(Q_1, Q_2) = p_2q_1/nf_1f_2$ where $Q_1$ should be the leftmost quantile.

For large $n$ the variance of the semi-interquartile range for a sample of size $n$ is thus
found by error propagation inserting the formulae above

$$V(S) = \frac{1}{4}\left(V(Q_1) + V(Q_3) - 2\,\mathrm{Cov}(Q_1,Q_3)\right) = \frac{1}{64n}\left(\frac{3}{f_1^2} + \frac{3}{f_3^2} - \frac{2}{f_1f_3}\right) = \frac{1}{16nf_1^2} = \frac{\pi^2\Gamma^2}{4n}$$

where $f_1$ and $f_3$ are the function values at the lower and upper quartile which are both
equal to $1/2\pi\Gamma$. This turns out to be exactly the same as the variance we found for the
median in the previous section.

After sorting the sample the quartiles are determined by interpolation between the two
observations closest to the quartile. In the case where $n + 2$ is a multiple of 4, i.e. the series
$n = 2, 6, 10, \ldots$, the lower quartile is exactly at the $\frac{n+2}{4}$:th observation and the upper quartile
at the $\frac{3n+2}{4}$:th observation. In the table below we give the expectations and variances of
the estimator of $S$ as well as the variance estimator $s^2$ for the normal, double-exponential
and Cauchy distributions. The variance estimator $s^2$ and its variance are given by

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \quad\text{and}\quad V(s^2) = \frac{\mu_4 - \mu_2^2}{n} + \frac{2\mu_2^2}{n(n-1)}$$

with $\mu_2$ and $\mu_4$ the second and fourth central moments. The expectation value of $s^2$ is
equal to the variance and thus it is an unbiased estimator.


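Distribution          HWHM                    $E(s^2)$                $V(s^2)$                    $E(S)$                     $V(S)$
Normal                $\sigma\sqrt{2\ln 2}$   $\sigma^2$              $\frac{2\sigma^4}{n-1}$     $0.6745\sigma$             $\frac{1}{16nf(Q_1)^2}$
Double-exponential    $\frac{\ln 2}{\lambda}$ $\frac{2}{\lambda^2}$   $\frac{20a}{n\lambda^4}$    $\frac{\ln 2}{\lambda}$    $\frac{1}{n\lambda^2}$
Cauchy                $\Gamma$                $\infty$                $\infty$                    $\Gamma$                   $\frac{\pi^2\Gamma^2}{4n}$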
In this table $a = 1 + \frac{0.4}{n-1}$ if we include the second term in the expression of $V(s^2)$ above
and $a = 1$ otherwise. It can be seen that the double-exponential distribution also has
HWHM $= S$ but for the normal distribution HWHM $\approx 1.1774\sigma$ as compared to $S \approx 0.6745\sigma$.

For the three distributions tested the semi-interquartile range estimator is biased. In
the case of the normal distribution the values are approaching the true value from below
while for the Cauchy and double-exponential distributions from above. The large $n$
approximation for $V(S)$ is conservative for the normal and double-exponential distribution
but not conservative for the Cauchy distribution. In the latter case the standard deviation
of $S$ is within 10% of the true value for $n > 50$ but for small values of $n$ it is substantially
larger than given by the formula. The estimated value for $\Gamma$ is less than 10% too big for
$n > 25$.


7.12 Random Number Generation 


In order to generate pseudorandom numbers from a Cauchy distribution we may solve the
equation $F(x) = \xi$ where $F(x)$ is the cumulative distribution function and $\xi$ is a uniform
pseudorandom number between 0 and 1. This means solving for $x$ in the equation

$$F(x) = \frac{1}{\pi}\int_{-\infty}^{x} \frac{\Gamma}{\Gamma^2 + (t-x_0)^2}\,dt = \xi$$

If we make the substitution $\tan\phi = (t - x_0)/\Gamma$ using that $d\phi/\cos^2\phi = dt/\Gamma$ we obtain

$$\frac{1}{\pi}\arctan\left(\frac{x-x_0}{\Gamma}\right) + \frac{1}{2} = \xi$$

which finally gives

$$x = x_0 + \Gamma\tan\left(\pi\left(\xi - \tfrac{1}{2}\right)\right)$$

as a pseudorandom number from a Cauchy distribution. One may easily see that it is
equivalent to use

$$x = x_0 + \Gamma\tan(2\pi\xi)$$

which is a somewhat simpler expression.
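The inversion of the cumulative distribution function can be sketched as follows. The function names, default parameter values and the rejection of an exact zero from the uniform generator are choices made for this illustration.

```python
import math
import random


def cauchy_from_uniform(xi, x0=0.0, gamma=1.0):
    """Solve F(x) = xi for x, with 0 < xi < 1."""
    return x0 + gamma * math.tan(math.pi * (xi - 0.5))


def cauchy_random(rng, x0=0.0, gamma=1.0):
    """One pseudorandom number from a Cauchy distribution."""
    xi = rng.random()
    while xi == 0.0:              # xi = 0 would correspond to x = -infinity
        xi = rng.random()
    return cauchy_from_uniform(xi, x0, gamma)


rng = random.Random(3)            # fixed seed, arbitrary
print([round(cauchy_random(rng, x0=1.5, gamma=2.0), 3) for _ in range(5)])
```

The quartiles give a simple check: xi = 0.25, 0.5 and 0.75 map to x0 - gamma, x0 and x0 + gamma.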

An alternative method (see also below) to obtain random numbers from a Cauchy
distribution would be to use

$$x = x_0 + \Gamma\,\frac{z_1}{z_2}$$

where $z_1$ and $z_2$ are two independent random numbers from a standard normal distribution.
However, if the standard normal random numbers are obtained through the Box-Muller
transformation then $z_1/z_2 = \tan 2\pi\xi$ and we are back to the previous method.

In generating pseudorandom numbers one may, if profitable, avoid the tangent by

a Generate in $u$ and $v$ two random numbers from a uniform distribution between -1
and 1.

b If $u^2 + v^2 > 1$ (outside circle with radius one in $uv$-plane) go back to a.

c Obtain $x = x_0 + \Gamma\frac{u}{v}$ as a random number from a Cauchy distribution.
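Steps a to c can be sketched as a small rejection loop. The function name is illustrative, and the extra guard against v = 0 is an addition of this sketch that is not part of the steps above. The usage lines estimate location and half-width from the sample median and the semi-interquartile range; no expected values are quoted since they depend on the generator.

```python
import random


def cauchy_without_tangent(rng, x0=0.0, gamma=1.0):
    """One Cauchy pseudorandom number from a point uniform inside the unit circle."""
    while True:
        u = 2.0 * rng.random() - 1.0      # step a: uniform between -1 and 1
        v = 2.0 * rng.random() - 1.0
        if u * u + v * v > 1.0:           # step b: outside the unit circle, retry
            continue
        if v == 0.0:                      # guard against division by zero (not in the text)
            continue
        return x0 + gamma * u / v         # step c


rng = random.Random(2024)                 # fixed seed, arbitrary
sample = sorted(cauchy_without_tangent(rng, x0=1.0, gamma=2.0) for _ in range(20001))
median = sample[10000]
half_width = 0.5 * (sample[15000] - sample[5000])   # semi-interquartile range
print(median, half_width)                 # should be close to x0 = 1 and gamma = 2
```

About 21% of the generated pairs are rejected in step b (the fraction of the square outside the circle is 1 - π/4).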
7.13 Physical Picture

A physical picture giving rise to the Cauchy distribution is as follows: Regard a plane in
which there is a point source which emits particles isotropically in the plane (either in the
full $2\pi$ region or in one hemisphere $\pi$ radians wide). The source is at the x-coordinate $x_0$
and the particles are detected in a detector extending along a line $\Gamma$ length units from the
source. This scenario is depicted in figure 5.


Figure 5: Physical scenario leading to a Cauchy distribution


The distribution in the variable $x$ along the detector will then follow the Cauchy
distribution. As can be seen by pure geometrical considerations this is in accordance with the
result above where pseudorandom numbers from a Cauchy distribution could be obtained
by $x = x_0 + \Gamma\tan\phi$, i.e. $\tan\phi = \frac{x-x_0}{\Gamma}$, with $\phi$ uniformly distributed between $-\frac{\pi}{2}$ and $\frac{\pi}{2}$.

To prove this let us start with the distribution in $\phi$

$$f(\phi) = \frac{1}{\pi} \quad\text{for}\quad -\frac{\pi}{2} \le \phi \le \frac{\pi}{2}$$

To change variables from $\phi$ to $x$ requires the derivative $d\phi/dx$ which is given by

$$\frac{d\phi}{dx} = \frac{\cos^2\phi}{\Gamma} = \frac{1}{\Gamma}\cos^2\arctan\left(\frac{x-x_0}{\Gamma}\right)$$

Note that the interval from $-\frac{\pi}{2}$ to $\frac{\pi}{2}$ in $\phi$ maps onto the interval $-\infty < x < \infty$. We get

$$f(x) = \left|\frac{d\phi}{dx}\right| f(\phi) = \frac{1}{\pi\Gamma}\cos^2\arctan\left(\frac{x-x_0}{\Gamma}\right) = \frac{1}{\pi\Gamma}\cos^2\phi = \frac{1}{\pi\Gamma} \cdot \frac{\Gamma^2}{\Gamma^2+(x-x_0)^2} = \frac{1}{\pi} \cdot \frac{\Gamma}{\Gamma^2+(x-x_0)^2}$$

i.e. the Cauchy distribution.

It is just as easy to make the proof in the reversed direction, i.e. given a Cauchy
distribution in $x$ one may show that the $\phi$-distribution is uniform between $-\frac{\pi}{2}$ and $\frac{\pi}{2}$.
7.14 Ratio Between Two Standard Normal Variables

As mentioned above the Cauchy distribution also arises if we take the ratio between two
standard normal variables $z_1$ and $z_2$, viz.

$$x = x_0 + \Gamma\,\frac{z_1}{z_2}.$$

In order to deduce the distribution in $x$ we first introduce a dummy variable $y$ which we
simply take as $z_2$ itself. We then make a change of variables from $z_1$ and $z_2$ to $x$ and $y$.
The transformation is given by

$$x = x_0 + \Gamma\,\frac{z_1}{z_2}$$
$$y = z_2$$

or if we express $z_1$ and $z_2$ in $x$ and $y$

$$z_1 = y(x-x_0)/\Gamma$$
$$z_2 = y$$


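The distribution in $x$ and $y$ is given by

$$f(x,y) = \left\|\frac{\partial(z_1,z_2)}{\partial(x,y)}\right\| f(z_1,z_2)$$

where the absolute value of the determinant of the Jacobian is equal to $|y|/\Gamma$ and $f(z_1,z_2)$
is the product of two independent standard normal distributions. We get

$$f(x,y) = \frac{|y|}{\Gamma} \cdot \frac{1}{2\pi}\, e^{-\frac{1}{2}(z_1^2+z_2^2)} = \frac{|y|}{2\pi\Gamma}\, e^{-\frac{1}{2}\left(\frac{y^2(x-x_0)^2}{\Gamma^2}+y^2\right)}$$

In order to obtain the marginal distribution in $x$ we integrate over $y$

$$f(x) = \int_{-\infty}^{\infty} f(x,y)\,dy = \frac{1}{2\pi\Gamma}\int_{-\infty}^{\infty} |y|\, e^{-\alpha y^2}\,dy$$

where we have put

$$\alpha = \frac{1}{2}\left(\left(\frac{x-x_0}{\Gamma}\right)^2 + 1\right)$$

for convenience. If we make the substitution $z = y^2$ we get

$$f(x) = 2 \cdot \frac{1}{2\pi\Gamma}\int_{0}^{\infty} e^{-\alpha z}\,\frac{dz}{2} = \frac{1}{2\pi\Gamma\alpha}$$

Note that the first factor of 2 comes from the fact that the region $-\infty < y < \infty$ maps
twice onto the region $0 < z < \infty$. Finally

$$f(x) = \frac{1}{2\pi\Gamma\alpha} = \frac{2}{2\pi\Gamma} \cdot \frac{1}{\left(\frac{x-x_0}{\Gamma}\right)^2+1} = \frac{1}{\pi} \cdot \frac{\Gamma}{(x-x_0)^2+\Gamma^2}$$

i.e. a Cauchy distribution.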
8 Chi-square Distribution

8.1 Introduction

The chi-square distribution is given by

$$f(x; n) = \frac{\left(\frac{x}{2}\right)^{\frac{n}{2}-1} e^{-\frac{x}{2}}}{2\,\Gamma\left(\frac{n}{2}\right)}$$

where the variable $x \ge 0$ and the parameter $n$, the number of degrees of freedom, is a
positive integer. In figure 6 the distribution is shown for $n$-values of 1, 2, 5 and 10. For
$n \ge 2$ the distribution has a maximum at $n - 2$.

Figure 6: Graph of chi-square distribution for some values of n


8.2 Moments 

Algebraic moments of order $k$ are given by

$$\mu'_k = E(x^k) = \frac{1}{2\,\Gamma\left(\frac{n}{2}\right)}\int_{0}^{\infty} x^k \left(\frac{x}{2}\right)^{\frac{n}{2}-1} e^{-\frac{x}{2}}\,dx = \frac{2^k}{\Gamma\left(\frac{n}{2}\right)}\int_{0}^{\infty} y^{\frac{n}{2}-1+k} e^{-y}\,dy = \frac{2^k\,\Gamma\left(\frac{n}{2}+k\right)}{\Gamma\left(\frac{n}{2}\right)} =$$
$$= 2^k \cdot \frac{n}{2}\left(\frac{n}{2}+1\right)\cdots\left(\frac{n}{2}+k-2\right)\left(\frac{n}{2}+k-1\right) = n(n+2)(n+4)\cdots(n+2k-2)$$

e.g. the first algebraic moment which is the expectation value is equal to $n$. A recursive
formula to calculate algebraic moments is thus given by

$$\mu'_k = \mu'_{k-1} \cdot (n+2k-2)$$

where we may start with $\mu'_0 = 1$ to find the expectation value $\mu'_1 = n$, $\mu'_2 = n(n+2)$ etc.

From this we may calculate the central moments which for the lowest orders become
$\mu_2 = 2n$, $\mu_3 = 8n$, $\mu_4 = 12n(n+4)$, $\mu_5 = 32n(5n+12)$ and $\mu_6 = 40n(3n^2+52n+96)$.
The coefficients of skewness and kurtosis thus become $\gamma_1 = 2\sqrt{2/n}$ and $\gamma_2 = 12/n$.
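The recursion for the algebraic moments, and the step from algebraic to central moments, can be sketched in a few lines. The function names are illustrative, and the conversion uses the general relations between algebraic and central moments, which are not spelled out in this section.

```python
def algebraic_moments(n, kmax):
    """Algebraic moments mu'_0 .. mu'_kmax of a chi-square distribution with n d.o.f."""
    moments = [1]                                   # mu'_0 = 1
    for k in range(1, kmax + 1):
        moments.append(moments[-1] * (n + 2 * k - 2))
    return moments


def central_from_algebraic(m):
    """Second to fourth central moments from the algebraic moments m[0..4]."""
    mu = m[1]
    mu2 = m[2] - mu**2
    mu3 = m[3] - 3 * mu * m[2] + 2 * mu**3
    mu4 = m[4] - 4 * mu * m[3] + 6 * mu**2 * m[2] - 3 * mu**4
    return mu2, mu3, mu4


n = 5
print(algebraic_moments(n, 4))                           # [1, 5, 35, 315, 3465]
print(central_from_algebraic(algebraic_moments(n, 4)))   # (10, 40, 540)
```

For n = 5 the output agrees with 2n = 10, 8n = 40 and 12n(n + 4) = 540.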


The fact that the expectation value of a chi-square distribution equals the number of 
degrees of freedom has led to a bad habit to give the ratio between a found chi-square 
value and the number of degrees of freedom. This is, however, not a very good variable 
and it may be misleading. We strongly recommend that one always should give both the 
chi-square value and degrees of freedom e.g. as $\chi^2$/n.d.f. = 9.7/5.

To judge the quality of the fit we want a better measure. Since the exact sampling 
distribution is known one should stick to the chi-square probability as calculated from an 
integral of the tail i.e. given a specific chi-square value for a certain number of degrees of 
freedom we integrate from this value to infinity (see below). 

As an illustration we show in figure 7 the chi-square probability for constant ratios of
$\chi^2$/n.d.f.

Figure 7: Chi-square probability for constant ratios of $\chi^2$/n.d.f.

Note e.g. that for few degrees of freedom we may have an acceptable chi-square value 
even for larger ratios. 


8.3 Characteristic Function 


The characteristic function for a chi-square distribution with n degrees of freedom is given 
by 




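$$\phi(t) = E(e^{itx}) = \frac{1}{2\Gamma\!\left(\frac{n}{2}\right)} \int_0^{\infty} \left(\frac{x}{2}\right)^{\frac{n}{2}-1} e^{-\left(\frac{1}{2}-it\right)x}\,dx = \frac{1}{2\Gamma\!\left(\frac{n}{2}\right)} \int_0^{\infty} \left(\frac{y}{1-2it}\right)^{\frac{n}{2}-1} e^{-y}\,\frac{dy}{\frac{1}{2}-it} = (1-2it)^{-\frac{n}{2}}$$

where the substitution $y = \left(\frac{1}{2}-it\right)x$ was made.


8.4 Cumulative Function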


The cumulative, or distribution, function for a chi-square distribution with n degrees of 
freedom is given by 


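$$F(x) = \frac{1}{2\Gamma\!\left(\frac{n}{2}\right)} \int_0^{x} \left(\frac{x}{2}\right)^{\frac{n}{2}-1} e^{-\frac{x}{2}}\,dx = \frac{1}{\Gamma\!\left(\frac{n}{2}\right)} \int_0^{x/2} y^{\frac{n}{2}-1} e^{-y}\,dy = P\!\left(\frac{n}{2}, \frac{x}{2}\right)$$

where $P\!\left(\frac{n}{2}, \frac{x}{2}\right)$ is the incomplete Gamma function (see section 42.5). In this calculation
we have made the simple substitution $y = x/2$ in simplifying the integral.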


8.5 Origin of the Chi-square Distribution 

If $z_1, z_2, \ldots, z_n$ are n independent standard normal random variables then $\sum_{i=1}^{n} z_i^2$ is
distributed as a chi-square variable with n degrees of freedom.

In order to prove this first regard the characteristic function for the square of a standard 
normal variable 

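$$E\left(e^{itz^2}\right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{z^2}{2}(1-2it)}\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{y^2}{2}}\,\frac{dy}{\sqrt{1-2it}} = \frac{1}{\sqrt{1-2it}}$$

where we made the substitution $y = z\sqrt{1-2it}$.

For a sum of n such independent variables the characteristic function is then given by

$$\phi(t) = (1-2it)^{-\frac{n}{2}}$$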

which we recognize as the characteristic function for a chi-square distribution with n degrees 
of freedom. 

This property implies that if x and y are independently distributed according to the chi- 
square distribution with n and m degrees of freedom, respectively, then x + y is distributed 
as a chi-square variable with m + n degrees of freedom. 

Indeed the requirement that all z's come from a standard normal distribution is more
than what is needed. The result is the same if all observations $x_i$ come from different normal
populations with means $\mu_i$ and variances $\sigma_i^2$ if we in each case calculate a standardized
variable by subtracting the mean and dividing by the standard deviation i.e. taking
$z_i = (x_i - \mu_i)/\sigma_i$.

8.6 Approximations

For large number of degrees of freedom n the chi-square distribution may be approximated 
by a normal distribution. There are at least three different approximations. Firstly we 
may naively construct a standardized variable 

$$z_1 = \frac{x - E(x)}{\sqrt{V(x)}} = \frac{x - n}{\sqrt{2n}}$$

which would tend to normality as n increases. Secondly an approximation, due to R. A. Fisher, 
is that the quantity 

$$z_2 = \sqrt{2x} - \sqrt{2n-1}$$

approaches a standard normal distribution faster than the standardized variable. Thirdly 
a transformation, due to E. B. Wilson and M. M. Hilferty, is that the cubic root of x/n is 
closely distributed as a standard normal distribution using 

$$z_3 = \frac{\left(\frac{x}{n}\right)^{1/3} - \left(1 - \frac{2}{9n}\right)}{\sqrt{\frac{2}{9n}}}$$

The second approximation is probably the most well known but the latter approaches
normality even faster. In fact there are even correction factors which may be applied to $z_3$
to give an even more accurate approximation (see e.g. [26])

$$z_4 = z_3 + h_n = z_3 + \frac{60}{n}\,h_{60}$$

with $h_{60}$ given for values of $z_2$ from -3.5 to 3.5 in steps of 0.5 (in this order the values of
$h_{60}$ are -0.0118, -0.0067, -0.0033, -0.0010, 0.0001, 0.0006, 0.0006, 0.0002, -0.0003, -0.0006,
-0.0005, 0.0002, 0.0017, 0.0043, and 0.0082).

To compare the quality of all these approximations we calculate the maximum deviation 
between the cumulative function for the true chi-square distribution and each of these 
approximations for n = 30 and n = 100. The results are shown in the table below. Normally 
one accepts $z_2$ for n > 100 while $z_3$, and certainly $z_4$, are even better already for n > 30.


Approximation    n = 30      n = 100
z_1              0.034       0.019
z_2              0.0085      0.0047
z_3              0.00039     0.00011
z_4              0.000044    0.000035
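
The comparison of the first three approximations can be reproduced roughly with the following Python sketch. It is only an illustration under stated assumptions: the function names are ours, the exact cumulative function is evaluated with the finite sum valid for even n, and the maximum is taken over a finite grid in x, so the values are at best close to those in the table.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative function G(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def chi2_cdf_even(x, n):
    """Exact chi-square cumulative function for even n (finite sum)."""
    term, s = 1.0, 0.0
    for r in range(n // 2):
        s += term                      # term = x^r / (2^r r!)
        term *= x / (2.0 * (r + 1))
    return 1.0 - math.exp(-x / 2.0) * s

def z1(x, n):
    return (x - n) / math.sqrt(2.0 * n)

def z2(x, n):
    return math.sqrt(2.0 * x) - math.sqrt(2.0 * n - 1.0)

def z3(x, n):
    c = 2.0 / (9.0 * n)
    return ((x / n) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)

def max_deviation(z_func, n, steps=2000):
    """Largest |exact - approximate| cumulative value on a grid in x."""
    x_max = n + 10.0 * math.sqrt(2.0 * n)
    grid = (x_max * (i + 1) / steps for i in range(steps))
    return max(abs(chi2_cdf_even(x, n) - norm_cdf(z_func(x, n))) for x in grid)

if __name__ == "__main__":
    for name, func in (("z1", z1), ("z2", z2), ("z3", z3)):
        print(name, max_deviation(func, 30))
```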


8.7 Random Number Generation 

As we saw above the sum of the squares of n independent standard normal random variables
gave a chi-square distribution with n degrees of freedom. This may be used as a technique to
produce pseudorandom numbers from a chi-square distribution. This requires a generator
for standard normal random numbers and may be quite slow. However, if we make use of
the Box-Muller transformation in order to obtain the standard normal random numbers
we may simplify the calculations.

First we recall the Box-Muller transformation which, given two pseudorandom numbers
$\xi_1$ and $\xi_2$ uniformly distributed between zero and one, through the transformation

$$z_1 = \sqrt{-2\ln\xi_1}\,\cos 2\pi\xi_2$$

$$z_2 = \sqrt{-2\ln\xi_1}\,\sin 2\pi\xi_2$$

gives, in $z_1$ and $z_2$, two independent pseudorandom numbers from a standard normal
distribution.

Adding n such squared random numbers implies that

$$y_{2k} = -2\ln(\xi_1 \cdot \xi_2 \cdots \xi_k)$$

$$y_{2k+1} = -2\ln(\xi_1 \cdot \xi_2 \cdots \xi_k) - 2\ln\xi_{k+1}\,\cos^2 2\pi\xi_{k+2}$$

for k a positive integer will be distributed as chi-square variables with an even or odd number
of degrees of freedom, respectively. In this manner a lot of unnecessary operations are avoided.

Since the chi-square distribution is a special case of the Gamma distribution we may 
also use a generator for this distribution. 
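
The product-of-uniforms technique may be sketched in Python as follows. This is a minimal illustration, not a tuned generator: the function name `chi2_random` and the optional `rng` argument are our own, and for very large n (beyond roughly 1400 degrees of freedom) the product of uniform numbers may underflow in double precision, so the logarithms would then have to be summed instead.

```python
import math
import random

def chi2_random(n, rng=random):
    """Chi-square random number with n degrees of freedom.

    Uses y = -2 ln(xi_1 ... xi_k) for the even part (k = n // 2) and adds
    one squared Box-Muller variable when n is odd.
    """
    k, odd = divmod(n, 2)
    prod = 1.0
    for _ in range(k):
        prod *= 1.0 - rng.random()      # uniform in (0, 1], avoids log(0)
    y = -2.0 * math.log(prod)
    if odd:
        xi_a = 1.0 - rng.random()       # uniform in (0, 1]
        xi_b = rng.random()
        y += -2.0 * math.log(xi_a) * math.cos(2.0 * math.pi * xi_b) ** 2
    return y

if __name__ == "__main__":
    rng = random.Random(1)
    sample = [chi2_random(5, rng) for _ in range(10000)]
    print(sum(sample) / len(sample))    # should be close to n = 5
```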


8.8 Confidence Intervals for the Variance 

If $x_1, x_2, \ldots, x_n$ are independent normal random variables from a $N(\mu, \sigma^2)$ distribution then
$(n-1)s^2/\sigma^2$, with $s^2$ the sample variance, is distributed according to the chi-square
distribution with $n-1$ degrees of freedom.
A $1-\alpha$ confidence interval for the variance is then given by

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$$

where $\chi^2_{\alpha,n}$ is the chi-square value for a distribution with n degrees of freedom for which the
probability to be greater or equal to this value is given by $\alpha$ (so that $\chi^2_{\alpha/2,\,n-1}$ is the larger
of the two values). See also below for calculations
of the probability content of the chi-square distribution.


8.9 Hypothesis Testing 

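Let $x_1, x_2, \ldots, x_n$ be n independent normal random variables distributed according to a
$N(\mu, \sigma^2)$ distribution. To test the null hypothesis $H_0: \sigma^2 = \sigma_0^2$ versus $H_1: \sigma^2 \ne \sigma_0^2$ at the $\alpha$
level of significance, we would reject the null hypothesis if $(n-1)s^2/\sigma_0^2$ is less than $\chi^2_{1-\alpha/2,\,n-1}$
or greater than $\chi^2_{\alpha/2,\,n-1}$, where $\chi^2_{\alpha,n}$ denotes the value exceeded with probability $\alpha$.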

8.10 Probability Content 

In testing hypotheses using the chi-square distribution we define $x_\alpha = \chi^2_{\alpha,n}$ from

$$F(x_\alpha) = \int_0^{x_\alpha} f(x;n)\,dx = 1 - \alpha$$

i.e. $\alpha$ is the probability that a variable distributed according to the chi-square distribution
with n degrees of freedom exceeds $x_\alpha$.

This formula can be used in order to determine confidence levels for certain values of $\alpha$.
This is what is done in producing the tables which are common in all statistics text-books.
However, more often the equation is used in order to calculate the confidence level $\alpha$ given
an experimentally determined chi-square value $x_\alpha$.

In calculating the probability content of a chi-square distribution we distinguish between the
case with even and odd number of degrees of freedom. This is described in the two following
subsections.


Note that one may argue that it is as unlikely to obtain a very small chi-square value
as a very big one. It is customary, however, to use only the upper tail in calculation of
significance levels. A too small chi-square value is regarded as not a big problem. However,
in such a case one should be somewhat critical since it indicates that one either is cheating,
is using selected (biased) data or has (unintentionally) overestimated measurement errors
(e.g. included systematic errors).

To proceed in calculating the cumulative function we write 


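$$1 - \alpha = F(x_\alpha) = \frac{1}{2\Gamma\!\left(\frac{n}{2}\right)} \int_0^{x_\alpha} \left(\frac{x}{2}\right)^{\frac{n}{2}-1} e^{-\frac{x}{2}}\,dx = \frac{1}{\Gamma\!\left(\frac{n}{2}\right)} \int_0^{x_\alpha/2} z^{\frac{n}{2}-1} e^{-z}\,dz = P\!\left(\frac{n}{2}, \frac{x_\alpha}{2}\right)$$

where we have made the substitution $z = x/2$. From this we see that we may use the
incomplete Gamma function P (see section 42.5) in evaluating probability contents but for
historical reasons we have solved the problem by considering the cases with even and odd
degrees of freedom separately as is shown in the next two subsections.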


Although we prefer exact routines to calculate the probability in each specific case a 
classical table may sometimes be useful. In table 1 on page 171 we show percentage points, 
i.e. points where the cumulative probability is 1 —a, for different degrees of freedom. 

It is sometimes of interest e.g. when rejecting a hypothesis using a chi-square test to 
scrutinize extremely small confidence levels. In table 2 on page 172 we show this for 
confidence levels down to $10^{-12}$ as chi-square values. In table 3 on page 173 we show the
same thing in terms of chi-square over degrees of freedom ratios (reluctantly since we do 
not like such ratios). As discussed in section 8.2 we see, perhaps even more clearly, that 
for few degrees of freedom the ratios may be very high while for large number of degrees 
of freedom this is not the case for the same confidence level. 






8.11 Even Number of Degrees of Freedom 

With even n the power of z in the last integral in the formula for $F(x_\alpha)$ above is an integer.
From standard integral tables we find

$$\int x^m e^{ax}\,dx = e^{ax} \sum_{r=0}^{m} (-1)^r \frac{m!\,x^{m-r}}{(m-r)!\,a^{r+1}}$$

where, in our case, $a = -1$. Putting $m = \frac{n}{2} - 1$ and using this integral we obtain

$$1 - \alpha = \frac{1}{\Gamma\!\left(\frac{n}{2}\right)} \int_0^{x_\alpha/2} z^{\frac{n}{2}-1} e^{-z}\,dz = \frac{1}{m!} \left[ e^{-z} \sum_{r=0}^{m} (-1)^r \frac{m!\,z^{m-r}}{(m-r)!\,(-1)^{r+1}} \right]_0^{x_\alpha/2}$$

$$= 1 - e^{-\frac{x_\alpha}{2}} \sum_{r=0}^{m} \frac{x_\alpha^{m-r}}{2^{m-r}(m-r)!} = 1 - e^{-\frac{x_\alpha}{2}} \sum_{r=0}^{\frac{n}{2}-1} \frac{x_\alpha^{r}}{2^{r}\,r!}$$

a result which indeed is identical to the formula for P(n, x) for integer n given on page 160.


8.12 Odd Number of Degrees of Freedom 

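In the case of odd number of degrees of freedom we make the substitution $z^2 = x$ yielding

$$1 - \alpha = F(x_\alpha) = \frac{1}{2\Gamma\!\left(\frac{n}{2}\right)} \int_0^{x_\alpha} \left(\frac{x}{2}\right)^{\frac{n}{2}-1} e^{-\frac{x}{2}}\,dx = \frac{1}{2^{\frac{n}{2}}\Gamma\!\left(\frac{n}{2}\right)} \int_0^{\sqrt{x_\alpha}} \left(z^2\right)^{\frac{n}{2}-1} e^{-\frac{z^2}{2}}\,2z\,dz$$

$$= \frac{1}{2^{\frac{n}{2}-1}\Gamma\!\left(\frac{n}{2}\right)} \int_0^{\sqrt{x_\alpha}} \left(z^2\right)^{\frac{n-1}{2}} e^{-\frac{z^2}{2}}\,dz = \frac{1}{2^{m-\frac{1}{2}}\,\Gamma\!\left(m+\frac{1}{2}\right)} \int_0^{\sqrt{x_\alpha}} z^{2m} e^{-\frac{z^2}{2}}\,dz$$

where we have put $m = \frac{n-1}{2}$ which for odd n is an integer. By partial integration in m
steps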


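$$\int z^{2m} e^{-\frac{z^2}{2}}\,dz = \int z^{2m-1}\,z e^{-\frac{z^2}{2}}\,dz = -z^{2m-1} e^{-\frac{z^2}{2}} + (2m-1) \int z^{2m-2} e^{-\frac{z^2}{2}}\,dz$$

$$\int z^{2m-2} e^{-\frac{z^2}{2}}\,dz = -z^{2m-3} e^{-\frac{z^2}{2}} + (2m-3) \int z^{2m-4} e^{-\frac{z^2}{2}}\,dz$$

$$\vdots$$

$$\int z^{4} e^{-\frac{z^2}{2}}\,dz = -z^{3} e^{-\frac{z^2}{2}} + 3 \int z^{2} e^{-\frac{z^2}{2}}\,dz$$

$$\int z^{2} e^{-\frac{z^2}{2}}\,dz = -z\,e^{-\frac{z^2}{2}} + \int e^{-\frac{z^2}{2}}\,dz$$

we obtain

$$\int z^{2m} e^{-\frac{z^2}{2}}\,dz = (2m-1)!! \int e^{-\frac{z^2}{2}}\,dz - \sum_{r=0}^{m-1} \frac{(2m-1)!!}{(2r+1)!!}\,z^{2r+1} e^{-\frac{z^2}{2}}$$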

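Applying this to our case gives

$$1 - \alpha = \frac{1}{2^{m-\frac{1}{2}}\,\Gamma\!\left(m+\frac{1}{2}\right)} \left( (2m-1)!! \int_0^{\sqrt{x_\alpha}} e^{-\frac{z^2}{2}}\,dz - \left[ \sum_{r=0}^{m-1} \frac{(2m-1)!!}{(2r+1)!!}\,z^{2r+1} e^{-\frac{z^2}{2}} \right]_0^{\sqrt{x_\alpha}} \right)$$

$$= \sqrt{\frac{2}{\pi}} \left( \int_0^{\sqrt{x_\alpha}} e^{-\frac{z^2}{2}}\,dz - e^{-\frac{x_\alpha}{2}} \sum_{r=0}^{m-1} \frac{x_\alpha^{\,r+\frac{1}{2}}}{(2r+1)!!} \right) = 2G(\sqrt{x_\alpha}) - 1 - \sqrt{\frac{2x_\alpha}{\pi}}\; e^{-\frac{x_\alpha}{2}} \sum_{r=0}^{m-1} \frac{x_\alpha^{\,r}}{(2r+1)!!}$$

where G(z) is the integral of the standard normal distribution from $-\infty$ to z. Here we
have used $\Gamma\!\left(m+\frac{1}{2}\right) = \frac{(2m-1)!!}{2^m}\sqrt{\pi}$ in order to simplify the coefficient. This result may be
compared to the formula given on page 160 for the incomplete Gamma function when the
first argument is a half-integer.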


8.13 Final Algorithm 

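The final algorithm to evaluate the probability content $\alpha$ from x to $\infty$ for a chi-square
distribution with n degrees of freedom (the content from $-\infty$ to x then being $1-\alpha$) is

• For n even:

o Put $m = \frac{n}{2} - 1$.
o Set $u_0 = 1$, $s = 0$ and $i = 0$.
o For $i = 0, 1, \ldots, m$ set $s = s + u_i$, $i = i + 1$ and $u_i = u_{i-1} \cdot \frac{x}{2i}$.
o $\alpha = s \cdot e^{-\frac{x}{2}}$.

• For n odd:

o Put $m = \frac{n-1}{2}$.
o Set $u_0 = 1$, $s = 0$ and $i = 0$.
o For $i = 0, 1, \ldots, m-1$ set $s = s + u_i$, $i = i + 1$ and $u_i = u_{i-1} \cdot \frac{x}{2i+1}$.
o $\alpha = 2 - 2G(\sqrt{x}) + \sqrt{\frac{2x}{\pi}}\; e^{-\frac{x}{2}} \cdot s$.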
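
The algorithm above may be sketched in Python as follows. The function name `chi2_upper_tail` is our own, and as a convenience (an assumption on our part, not part of the algorithm) the term $2 - 2G(\sqrt{x})$ is evaluated as `math.erfc(sqrt(x/2))`, which is the same quantity expressed through the complementary error function.

```python
import math

def chi2_upper_tail(x, n):
    """Probability alpha that a chi-square variable with n d.o.f. exceeds x."""
    if x <= 0.0:
        return 1.0
    if n % 2 == 0:
        m = n // 2 - 1
        u, s = 1.0, 0.0
        for i in range(m + 1):              # terms x^r / (2^r r!), r = 0..m
            s += u
            u *= x / (2.0 * (i + 1))
        return s * math.exp(-x / 2.0)
    m = (n - 1) // 2
    u, s = 1.0, 0.0
    for i in range(m):                      # terms x^r / (2r+1)!!, r = 0..m-1
        s += u
        u *= x / (2.0 * (i + 1) + 1.0)
    two_minus_2g = math.erfc(math.sqrt(x / 2.0))   # equals 2 - 2 G(sqrt(x))
    return two_minus_2g + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * s

if __name__ == "__main__":
    # The example quoted in section 8.2: chi-square 9.7 for 5 degrees of freedom.
    print(chi2_upper_tail(9.7, 5))
```

For very large x and n the individual terms may overflow or underflow; a production routine would more likely call an incomplete Gamma function directly.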


8.14 Chi Distribution 


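Sometimes, but less often, the chi distribution i.e. the distribution of $y = \sqrt{x}$ is used. By
a simple change of variables this distribution is given by

$$f(y) = \left|\frac{dx}{dy}\right| f(y^2) = 2y \cdot \frac{1}{2\Gamma\!\left(\frac{n}{2}\right)} \left(\frac{y^2}{2}\right)^{\frac{n}{2}-1} e^{-\frac{y^2}{2}} = \frac{y}{\Gamma\!\left(\frac{n}{2}\right)} \left(\frac{y^2}{2}\right)^{\frac{n}{2}-1} e^{-\frac{y^2}{2}}$$

In figure 8 the chi distribution is shown for n-values of 1, 2, 5, and 10. The mode of the
distribution is at $\sqrt{n-1}$.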

The cumulative function for the chi distribution becomes 


$$F(y) = \frac{1}{2^{\frac{n}{2}-1}\,\Gamma\!\left(\frac{n}{2}\right)} \int_0^{y} x^{n-1} e^{-\frac{x^2}{2}}\,dx = P\!\left(\frac{n}{2}, \frac{y^2}{2}\right)$$

9 Compound Poisson Distribution

9.1 Introduction 


The compound Poisson distribution describes the branching process for Poisson variables
and is given by

$$p(r;\mu,\lambda) = \sum_{n=0}^{\infty} \frac{(n\mu)^r e^{-n\mu}}{r!}\,\frac{\lambda^n e^{-\lambda}}{n!}$$

where the integer variable $r \ge 0$ and the parameters $\mu$ and $\lambda$ are positive real quantities.


9.2 Branching Process 

The distribution describes the branching of n Poisson variables $n_i$, all with mean $\mu$, where
n is also distributed according to the Poisson distribution with mean $\lambda$ i.e.

$$r = \sum_{i=1}^{n} n_i \quad \text{with} \quad p(n_i) = \frac{\mu^{n_i} e^{-\mu}}{n_i!} \quad \text{and} \quad p(n) = \frac{\lambda^n e^{-\lambda}}{n!}$$

and thus

$$p(r) = \sum_{n=0}^{\infty} p(r|n)\,p(n)$$

Due to the so called addition theorem (see page 121) for Poisson variables with mean $\mu$ the
sum of n such variables is distributed as a Poisson variable with mean $n\mu$ and thus the
distribution given above results.


9.3 Moments 

The expectation value and variance of the Compound Poisson distribution are given by

$$E(r) = \lambda\mu \quad \text{and} \quad V(r) = \lambda\mu(1+\mu)$$

while higher moments get slightly more complicated:

$$\mu_3 = \mu\lambda \left\{ \mu + (\mu+1)^2 \right\}$$

$$\mu_4 = \mu\lambda \left\{ \mu^3 + 6\mu^2 + 7\mu + 1 + 3\mu\lambda(1+\mu)^2 \right\}$$

$$\mu_5 = \mu\lambda \left\{ \mu^4 + 10\mu^3 + 25\mu^2 + 15\mu + 1 + 10\mu\lambda(\mu+1)\left(\mu + (1+\mu)^2\right) \right\}$$

$$\mu_6 = \mu\lambda \left\{ \mu^5 + 15\mu^4 + 65\mu^3 + 90\mu^2 + 31\mu + 1 + 5\mu\lambda\left(5\mu^4 + 33\mu^3 + 61\mu^2 + 36\mu + 5\right) + 15\mu^2\lambda^2(\mu+1)^3 \right\}$$

9.4 Probability Generating Function

The probability generating function of the compound Poisson distribution is given by

$$G(z) = \exp\left\{ -\lambda + \lambda e^{-\mu+\mu z} \right\}$$

This is easily found by using the rules for branching processes where the probability
generating function (p.g.f.) is given by

$$G(z) = G_P(G_P(z))$$

where $G_P(z)$ is the p.g.f. for the Poisson distribution, the outer one taken with mean $\lambda$ and
the inner one with mean $\mu$.

9.5 Random Number Generation 

Using the basic definition we may proceed by first generating a random number n from a
Poisson distribution with mean $\lambda$ and then another one with mean $n\mu$.

For fixed $\mu$ and $\lambda$ it is, however, normally much faster to prepare a cumulative vector
for values ranging from zero up to the point where computer precision gives unity and then
use this vector for random number generation. Using a binary search technique one may
allow for quite long vectors giving good precision without much loss in efficiency.
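
The cumulative-vector technique could be sketched in Python as below. All names are ours, and two choices are assumptions made for illustration: the sum over n is truncated at a point well beyond the bulk of the Poisson distribution with mean $\lambda$, and the vector is ended when the cumulative probability is within a small tolerance `eps` of unity instead of at exact machine precision.

```python
import bisect
import math
import random

def compound_poisson_pmf(r, mu, lam):
    """p(r; mu, lam), with each term of the n-sum evaluated in log space."""
    n_max = int(lam + 12.0 * math.sqrt(lam) + 30.0)   # assumed truncation point
    total = math.exp(-lam) if r == 0 else 0.0          # n = 0 contributes only to r = 0
    for n in range(1, n_max + 1):
        log_term = (r * math.log(n * mu) - n * mu - math.lgamma(r + 1)
                    + n * math.log(lam) - lam - math.lgamma(n + 1))
        total += math.exp(log_term)
    return total

def cumulative_vector(mu, lam, eps=1e-12, r_limit=10000):
    """Cumulative probabilities P(R <= r) for r = 0, 1, ... until close to one."""
    cum, acc, r = [], 0.0, 0
    while acc < 1.0 - eps and r < r_limit:
        acc += compound_poisson_pmf(r, mu, lam)
        cum.append(acc)
        r += 1
    return cum

def draw(cum, rng=random):
    """Binary search for the smallest r whose cumulative value exceeds xi."""
    xi = rng.random()
    return min(bisect.bisect_right(cum, xi), len(cum) - 1)

if __name__ == "__main__":
    cum = cumulative_vector(mu=2.0, lam=3.0)
    rng = random.Random(1)
    sample = [draw(cum, rng) for _ in range(10000)]
    print(sum(sample) / len(sample))    # should be close to lam * mu = 6
```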
10 Double-Exponential Distribution

10.1 Introduction 

The Double-exponential distribution is given by

$$f(x;\mu,\lambda) = \frac{\lambda}{2}\,e^{-\lambda|x-\mu|}$$

where the variable x is a real number as is the location parameter $\mu$ while the parameter
$\lambda$ is a real positive number.

The distribution is sometimes called the Laplace distribution after the French astronomer,
mathematician and physicist marquis Pierre Simon de Laplace (1749-1827). It is a symmetric
distribution whose tails fall off less sharply than the Gaussian distribution but faster
than the Cauchy distribution. It has a cusp, i.e. a discontinuous first derivative, at $x = \mu$.

The distribution has an interesting feature inasmuch as the best estimator for the mean
$\mu$ is the median and not the sample mean. See further the discussion in section 7 on the
Cauchy distribution where the Double-exponential distribution is discussed in some detail.


10.2 Moments 


For the Double-exponential distribution central moments are easier to determine than
algebraic moments (the mean is $\mu'_1 = \mu$). They are given by

$$\mu_n = \int_{-\infty}^{\infty} (x-\mu)^n f(x)\,dx = \frac{\lambda}{2} \left\{ \int_{-\infty}^{\mu} (x-\mu)^n e^{-\lambda(\mu-x)}\,dx + \int_{\mu}^{\infty} (x-\mu)^n e^{-\lambda(x-\mu)}\,dx \right\} = \frac{n!}{2\lambda^n} + (-1)^n \frac{n!}{2\lambda^n}$$

i.e. odd moments vanish as they should due to the symmetry of the distribution and even
moments are given by the simple relation $\mu_n = n!/\lambda^n$. From this one easily finds that the
coefficient of skewness is zero and the coefficient of kurtosis 3.

If required algebraic moments may be calculated from the central moments; especially
the lowest order algebraic moments become

$$\mu'_1 = \mu, \quad \mu'_2 = \frac{2}{\lambda^2} + \mu^2, \quad \mu'_3 = \frac{6\mu}{\lambda^2} + \mu^3, \quad \text{and} \quad \mu'_4 = \frac{24}{\lambda^4} + \frac{12\mu^2}{\lambda^2} + \mu^4$$

but more generally

$$\mu'_n = \sum_{r=0}^{n} \binom{n}{r} \mu^r \mu_{n-r}$$

10.3 Characteristic Function 

The characteristic function which generates central moments is given by

$$\phi_{x-\mu}(t) = \frac{\lambda^2}{\lambda^2 + t^2}$$

from which we may find the characteristic function which generates algebraic moments

$$\phi_x(t) = E(e^{itx}) = e^{it\mu} E\left(e^{it(x-\mu)}\right) = e^{it\mu}\,\frac{\lambda^2}{\lambda^2 + t^2}$$

Sometimes an alternative which generates the sequence $\mu'_1, \mu_2, \mu_3, \ldots$ is given as

$$\phi(t) = it\mu + \frac{\lambda^2}{\lambda^2 + t^2}$$


10.4 Cumulative Function 


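The cumulative function, or distribution function, for the Double-exponential distribution
is given by

$$F(x) = \begin{cases} \frac{1}{2}\,e^{-\lambda(\mu-x)} & \text{if } x \le \mu \\ 1 - \frac{1}{2}\,e^{-\lambda(x-\mu)} & \text{if } x > \mu \end{cases}$$

From this we see not only the obvious that the median is at $x = \mu$ but also that the lower
and upper quartiles are located at $\mu \mp \ln 2/\lambda$.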


10.5 Random Number Generation 

Given a uniform random number $\xi$ between zero and one, a random number from a
Double-exponential distribution is given by solving the equation $F(x) = \xi$ for x giving

For $\xi \le \frac{1}{2}$: $x = \mu + \ln(2\xi)/\lambda$
for $\xi > \frac{1}{2}$: $x = \mu - \ln(2 - 2\xi)/\lambda$
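
A minimal Python sketch of this inversion is given below. The function names are ours; the cumulative function is included only so that the inversion can be checked, and the redraw on an exact zero is a precaution we add to avoid taking the logarithm of zero.

```python
import math
import random

def double_exponential_from_uniform(xi, mu, lam):
    """Solve F(x) = xi for the Double-exponential distribution."""
    if xi <= 0.5:
        return mu + math.log(2.0 * xi) / lam
    return mu - math.log(2.0 - 2.0 * xi) / lam

def double_exponential_cdf(x, mu, lam):
    """Cumulative function F(x), used here to verify the inversion."""
    if x <= mu:
        return 0.5 * math.exp(-lam * (mu - x))
    return 1.0 - 0.5 * math.exp(-lam * (x - mu))

def double_exponential_random(mu, lam, rng=random):
    xi = rng.random()
    while xi == 0.0:            # avoid log(0)
        xi = rng.random()
    return double_exponential_from_uniform(xi, mu, lam)

if __name__ == "__main__":
    x = double_exponential_from_uniform(0.25, mu=1.0, lam=2.0)
    print(x, double_exponential_cdf(x, 1.0, 2.0))   # lower quartile, F = 0.25
```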
11 Doubly Non-Central F-Distribution


11.1 Introduction 


If $x_1$ and $x_2$ are independently distributed according to two non-central chi-square
distributions with $n_1$ and $n_2$ degrees of freedom and non-centrality parameters $\lambda_1$ and $\lambda_2$,
respectively, then the variable

$$F' = \frac{x_1/n_1}{x_2/n_2}$$

is said to have a doubly non-central F-distribution with $n_1, n_2$ degrees of freedom (positive
integers) and non-centrality parameters $\lambda_1, \lambda_2$ (both $\ge 0$).

This distribution may be written as

$$f(x; n_1, n_2, \lambda_1, \lambda_2) = \frac{n_1}{n_2}\,e^{-\frac{\lambda}{2}} \sum_{r=0}^{\infty} \sum_{s=0}^{\infty} \frac{\left(\frac{\lambda_1}{2}\right)^r}{r!}\,\frac{\left(\frac{\lambda_2}{2}\right)^s}{s!}\,\frac{\left(\frac{n_1 x}{n_2}\right)^{\frac{n_1}{2}+r-1}}{\left(1 + \frac{n_1 x}{n_2}\right)^{\frac{n}{2}+r+s}}\,\frac{1}{B\!\left(\frac{n_1}{2}+r, \frac{n_2}{2}+s\right)}$$

where we have put $n = n_1 + n_2$ and $\lambda = \lambda_1 + \lambda_2$. For $\lambda_2 = 0$ we obtain the (singly)
non-central F-distribution (see section 32) and if also $\lambda_1 = 0$ we are back to the ordinary
variance ratio, or F-, distribution (see section 16).

With four parameters a variety of shapes are possible. As an example figure 9 shows the
doubly non-central F-distribution for the case with $n_1 = 10$, $n_2 = 5$ and $\lambda_1 = 10$ varying
$\lambda_2$ from zero (an ordinary non-central F-distribution) to five.



Figure 9: Examples of doubly non-central F-distributions 


11.2 Moments 


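Algebraic moments of this distribution become

$$E(x^k) = \left(\frac{n_2}{n_1}\right)^k e^{-\frac{\lambda}{2}} \sum_{r=0}^{\infty} \frac{\left(\frac{\lambda_1}{2}\right)^r}{r!}\,\frac{\Gamma\!\left(\frac{n_1}{2}+r+k\right)}{\Gamma\!\left(\frac{n_1}{2}+r\right)} \cdot \sum_{s=0}^{\infty} \frac{\left(\frac{\lambda_2}{2}\right)^s}{s!}\,\frac{\Gamma\!\left(\frac{n_2}{2}+s-k\right)}{\Gamma\!\left(\frac{n_2}{2}+s\right)}$$

The r-sum involved, with a polynomial in the numerator, is quite easily solvable giving
similar expressions as for the (singly) non-central F-distribution. The s-sums, however,
with polynomials in the denominator give rise to confluent hypergeometric functions
M(a, b; x) (see appendix B). Lower order algebraic moments are given by

$$E(x) = e^{-\frac{\lambda_2}{2}}\,\frac{n}{m}\,\frac{m+\lambda}{n-2}\,M\!\left(\frac{n-2}{2}, \frac{n}{2}; \frac{\lambda_2}{2}\right)$$

$$E(x^2) = e^{-\frac{\lambda_2}{2}} \left(\frac{n}{m}\right)^2 \frac{\lambda^2 + (2\lambda+m)(m+2)}{(n-2)(n-4)}\,M\!\left(\frac{n-4}{2}, \frac{n}{2}; \frac{\lambda_2}{2}\right)$$

$$E(x^3) = e^{-\frac{\lambda_2}{2}} \left(\frac{n}{m}\right)^3 \frac{\lambda^3 + 3(m+4)\lambda^2 + (3\lambda+m)(m+2)(m+4)}{(n-2)(n-4)(n-6)}\,M\!\left(\frac{n-6}{2}, \frac{n}{2}; \frac{\lambda_2}{2}\right)$$

$$E(x^4) = e^{-\frac{\lambda_2}{2}} \left(\frac{n}{m}\right)^4 \frac{\lambda^4 + (m+6)\left\{4\lambda^3 + (m+4)\left[6\lambda^2 + (4\lambda+m)(m+2)\right]\right\}}{(n-2)(n-4)(n-6)(n-8)}\,M\!\left(\frac{n-8}{2}, \frac{n}{2}; \frac{\lambda_2}{2}\right)$$

Note that in these four expressions m and n stand for the numerator and denominator
degrees of freedom $n_1$ and $n_2$, and $\lambda$ for the numerator non-centrality parameter $\lambda_1$, i.e.
the notation of the (singly) non-central F-distribution.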

11.3 Cumulative Distribution 

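The cumulative, or distribution, function may be deduced by simple integration

$$F(x) = \frac{n_1}{n_2}\,e^{-\frac{\lambda}{2}} \sum_{r=0}^{\infty} \sum_{s=0}^{\infty} \frac{\left(\frac{\lambda_1}{2}\right)^r}{r!}\,\frac{\left(\frac{\lambda_2}{2}\right)^s}{s!}\,\frac{1}{B\!\left(\frac{n_1}{2}+r, \frac{n_2}{2}+s\right)} \int_0^{x} \frac{\left(\frac{n_1 u}{n_2}\right)^{\frac{n_1}{2}+r-1}}{\left(1 + \frac{n_1 u}{n_2}\right)^{\frac{n}{2}+r+s}}\,du$$

$$= e^{-\frac{\lambda}{2}} \sum_{r=0}^{\infty} \sum_{s=0}^{\infty} \frac{\left(\frac{\lambda_1}{2}\right)^r}{r!}\,\frac{\left(\frac{\lambda_2}{2}\right)^s}{s!}\,I_q\!\left(\frac{n_1}{2}+r, \frac{n_2}{2}+s\right)$$

where $q = \frac{n_1 x/n_2}{1 + n_1 x/n_2}$ and $I_q$ denotes the regularized incomplete Beta function.

11.4 Random Number Generation

Random numbers from a doubly non-central F-distribution are easily obtained using the
definition in terms of the ratio between two independent random numbers from non-central
chi-square distributions. This ought to be sufficient for most applications but if needed more
efficient techniques may easily be developed e.g. using more general techniques.

12 Doubly Non-Central t-Distribution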


12.1 Introduction 


If x and y are independent and x is normally distributed with mean $\delta$ and unit variance
while y is distributed according to a non-central chi-square distribution with n degrees of
freedom and non-centrality parameter $\lambda$ then the variable

$$t = \frac{x}{\sqrt{y/n}}$$

is said to have a doubly non-central t-distribution with n degrees of freedom (positive
integer) and non-centrality parameters $\delta$ and $\lambda$ (with $\lambda \ge 0$).

This distribution may be expressed as

$$f(t; n, \delta, \lambda) = \frac{e^{-\frac{\delta^2}{2}}\,e^{-\frac{\lambda}{2}}}{\sqrt{n\pi}} \sum_{r=0}^{\infty} \frac{\left(\frac{\lambda}{2}\right)^r}{r!\,\Gamma\!\left(\frac{n}{2}+r\right)} \sum_{s=0}^{\infty} \frac{(t\delta)^s}{s!} \left(\frac{2}{n}\right)^{\frac{s}{2}} \left(1 + \frac{t^2}{n}\right)^{-\left(\frac{n+s+1}{2}+r\right)} \Gamma\!\left(\frac{n+s+1}{2}+r\right)$$

For $\lambda = 0$ we obtain the (singly) non-central t-distribution (see section 33) and if also $\delta = 0$
we are back to the ordinary t-distribution (see section 38).

Examples of doubly non-central t-distributions are shown in figure 10 for the case with
$n = 10$ and $\delta^2 = 5$ varying $\lambda$ from zero (an ordinary non-central t-distribution) to ten.

Figure 10: Examples of doubly non-central t-distributions


12.2 Moments 

Algebraic moments may be deduced from the expression

$$E(t^k) = \frac{n^{\frac{k}{2}}\,e^{-\frac{\delta^2}{2}}\,e^{-\frac{\lambda}{2}}}{\sqrt{\pi}} \sum_{r=0}^{\infty} \frac{\left(\frac{\lambda}{2}\right)^r}{r!}\,\frac{\Gamma\!\left(\frac{n}{2}+r-\frac{k}{2}\right)}{\Gamma\!\left(\frac{n}{2}+r\right)} \sum_{s} \frac{\delta^s\,2^{\frac{s}{2}}}{s!}\,\Gamma\!\left(\frac{s+k+1}{2}\right)$$

where the sum should be taken for even values of s + k i.e. for even (odd) orders sum only
over even (odd) s-values.

Distinguishing between moments of even and odd order, lower order algebraic moments of the
doubly non-central t-distribution may be expressed in terms of the confluent hypergeometric
function M(a, b; x) (see appendix B for details).

12.3 Cumulative Distribution

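The cumulative, or distribution, function is given by

$$F(t) = \frac{e^{-\frac{\delta^2}{2}}\,e^{-\frac{\lambda}{2}}}{\sqrt{\pi}} \sum_{r=0}^{\infty} \frac{\left(\frac{\lambda}{2}\right)^r}{r!} \sum_{s=0}^{\infty} \frac{\delta^s}{s!}\,2^{\frac{s}{2}-1}\,\Gamma\!\left(\frac{s+1}{2}\right) \left\{ s_1 + s_2\,I_q\!\left(\frac{s+1}{2}, \frac{n}{2}+r\right) \right\}$$

where $q = (t^2/n)/(1 + t^2/n)$, $I_q$ is the regularized incomplete Beta function, and $s_1$, $s_2$ are
signs differing between cases with positive or
negative t as well as odd or even s in the summation. More specifically, the sign $s_1$ is -1 if s
is odd and +1 if it is even while $s_2$ is +1 unless t < 0 and s is even in which case it is -1.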


12.4 Random Number Generation 

Random numbers from a doubly non-central t-distribution are easily obtained with the
definition given above using random numbers from a normal distribution and a non-central
chi-square distribution. This ought to be sufficient for most applications but if needed more
efficient techniques may easily be developed e.g. using more general techniques.

13 Error Function

13.1 Introduction 

A function, related to the probability content of the normal distribution, which often is 
referred to is the error function 


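$$\mathrm{erf}\,z = \frac{2}{\sqrt{\pi}} \int_0^{z} e^{-t^2}\,dt$$

and its complement

$$\mathrm{erfc}\,z = \frac{2}{\sqrt{\pi}} \int_z^{\infty} e^{-t^2}\,dt = 1 - \mathrm{erf}\,z$$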


These functions may be defined for complex arguments (for many relations concerning the
error function see [27]) but here we are mainly interested in the function for real positive
values of z. However, sometimes one may still want to define the function values for negative
real values of z using symmetry relations


erf(-z) = -erf(z) 
erfc(-z) = 1 — erf(-z) = 1 + erf(z) 


13.2 Probability Density Function 

As is seen the error function erf is a distribution (or cumulative) function and the
corresponding probability density function is given by

$$f(z) = \frac{2}{\sqrt{\pi}}\,e^{-z^2}$$

If we make the transformation $z = (x-\mu)/\sigma\sqrt{2}$ we obtain a folded normal distribution

$$f(x;\mu,\sigma) = \frac{1}{\sigma}\sqrt{\frac{2}{\pi}}\;e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

where the function is defined for $x > \mu$ corresponding to $z > 0$, $\mu$ may be any real number
while $\sigma > 0$.

This implies that $\mathrm{erf}(z/\sqrt{2})$ is equal to the symmetric integral of a standard normal
distribution between $-z$ and $z$.

The error function may also be expressed in terms of the incomplete Gamma function

$$\mathrm{erf}\,x = \frac{2}{\sqrt{\pi}} \int_0^{x} e^{-t^2}\,dt = P\!\left(\frac{1}{2}, x^2\right)$$

defined for $x \ge 0$.
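
As a small illustration of the relation between the error function and the normal distribution, the following Python sketch uses the standard library functions `math.erf` and `math.erfc`; the two helper names are ours.

```python
import math

def symmetric_normal_content(z):
    """Probability content of a standard normal between -z and z."""
    return math.erf(z / math.sqrt(2.0))

def normal_cdf(z):
    """Standard normal cumulative function expressed through erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

if __name__ == "__main__":
    for z in (1.0, 2.0, 3.0):
        print(z, symmetric_normal_content(z))   # the familiar 1, 2 and 3 sigma contents
```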


14 Exponential Distribution


14.1 Introduction 

The exponential distribution is given by

$$f(x;\alpha) = \frac{1}{\alpha}\,e^{-\frac{x}{\alpha}}$$

where the variable x as well as the parameter $\alpha$ are positive real quantities.

The exponential distribution occurs in many different connections such as radioactive
or particle decays or the time between events in a Poisson process where events happen at
a constant rate.


14.2 Cumulative Function 

The cumulative (distribution) function is

$$F(x) = \int_0^{x} f(x)\,dx = 1 - e^{-\frac{x}{\alpha}}$$

and it is thus straightforward to calculate the probability content in any given situation.
E.g. we find that the median and the lower and upper quartiles are at

$$M = \alpha\ln 2 \approx 0.693\alpha, \quad Q_1 = -\alpha\ln\tfrac{3}{4} \approx 0.288\alpha, \quad \text{and} \quad Q_3 = \alpha\ln 4 \approx 1.386\alpha$$

14.3 Moments 

The expectation value, variance, and lowest order central moments are given by

$$E(x) = \alpha, \quad V(x) = \alpha^2, \quad \mu_3 = 2\alpha^3, \quad \mu_4 = 9\alpha^4,$$

$$\mu_5 = 44\alpha^5, \quad \mu_6 = 265\alpha^6, \quad \mu_7 = 1854\alpha^7, \quad \text{and} \quad \mu_8 = 14833\alpha^8$$

More generally algebraic moments are given by

$$\mu'_n = \alpha^n n!$$

Central moments thereby become

$$\mu_n = \alpha^n n! \sum_{m=0}^{n} \frac{(-1)^m}{m!} \to \frac{\alpha^n n!}{e} = \frac{\mu'_n}{e} \quad \text{when } n \to \infty$$

The approximation is, in fact, quite good already for n = 5 where the absolute error is
$0.146\alpha^5$ and the relative error 0.3%.


14.4 Characteristic Function 

The characteristic function of the exponential distribution is given by

$$\phi(t) = E(e^{itx}) = \frac{1}{\alpha} \int_0^{\infty} e^{\left(it - \frac{1}{\alpha}\right)x}\,dx = \frac{1}{1 - it\alpha}$$

14.5 Random Number Generation

The most common way to achieve random numbers from an exponential distribution is to
use the inverse to the cumulative distribution such that

$$x = F^{-1}(\xi) = -\alpha\ln(1-\xi) = -\alpha\ln\xi'$$

where $\xi$ is a uniform random number between zero and one (taking care not to include exactly
zero in the range) and so is, of course, also $\xi' = 1 - \xi$.
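
In Python the inversion may be sketched as follows. The function name is ours; since `random.random()` returns values in the half-open interval [0, 1), taking $\xi' = 1 - \xi$ keeps the argument of the logarithm away from exactly zero.

```python
import math
import random

def exponential_random(alpha, rng=random):
    """Exponential random number with mean alpha by inversion of F(x)."""
    xi_prime = 1.0 - rng.random()       # uniform in (0, 1], never exactly zero
    return -alpha * math.log(xi_prime)

if __name__ == "__main__":
    rng = random.Random(1)
    sample = [exponential_random(2.0, rng) for _ in range(10000)]
    print(sum(sample) / len(sample))    # should be close to alpha = 2
```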

There are, however, alternatives, some of which may be of interest and useful if
the penalty of using the logarithm is big on a given system [28].

14.5.1 Method by von Neumann 

The first of these is due to J. von Neumann [29] and is as follows (with different $\xi$'s denoting
independent uniform random numbers between zero and one)

i Set $a = 0$.

ii Generate $\xi$ and put $\xi_0 = \xi$.

iii Generate $\xi^*$ and if $\xi^* > \xi$ then go to vi.

iv Generate $\xi$ and if $\xi < \xi^*$ then go to iii.

v Put $a = a + 1$ and go to ii.

vi Put $x = \alpha(a + \xi_0)$ as a random number from an exponential distribution.
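
The steps above can be sketched in Python as follows. The function name is ours. Note that step iii is implemented with the comparison $\xi^* > \xi$: the candidate $\xi_0$ is then accepted when the run of successively decreasing numbers is broken after an odd number of draws, which happens with probability $e^{-\xi_0}$ as required.

```python
import random

def exponential_von_neumann(alpha, rng=random):
    """Exponential random number with mean alpha without using logarithms."""
    a = 0                                    # step i
    while True:
        xi0 = xi = rng.random()              # step ii
        while True:
            xi_star = rng.random()           # step iii
            if xi_star > xi:
                return alpha * (a + xi0)     # step vi
            xi = rng.random()                # step iv
            if xi >= xi_star:
                break                        # run broken after an even count
        a += 1                               # step v

if __name__ == "__main__":
    rng = random.Random(1)
    sample = [exponential_von_neumann(2.0, rng) for _ in range(10000)]
    print(sum(sample) / len(sample))         # should be close to alpha = 2
```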

14.5.2 Method by Marsaglia 

The second technique is attributed to G. Marsaglia [30]. 

• Prepare

$$p_n = 1 - e^{-n} \quad \text{and} \quad q_n = \frac{1}{e-1}\left(\frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!}\right)$$

for $n = 1, 2, \ldots$ until the largest representable fraction below one is exceeded in both
vectors.

i Put $i = 0$ and generate $\xi$.

ii If $\xi > p_{i+1}$ put $i = i + 1$ and perform this step again.

iii Put $k = 1$, generate $\xi$ and $\xi^*$, and set $\xi_{min} = \xi^*$.

iv If $\xi \le q_k$ then go to vi else set $k = k + 1$.

v Generate a new $\xi^*$, set $\xi_{min} = \xi^*$ if $\xi^* < \xi_{min}$, and go to iv.

vi Put $x = \alpha(i + \xi_{min})$ as an exponentially distributed random number.
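
A Python sketch of this technique is given below. The names are ours, and the tables are built under the assumption of double precision arithmetic: the vector p is extended until it reaches exactly one, while q is cut after a fixed number of terms with its last entry forced to one so that the search always terminates.

```python
import math
import random

def marsaglia_tables():
    """Tables p_n = 1 - e^{-n} and q_n = (1/1! + ... + 1/n!) / (e - 1)."""
    p, n = [], 1
    while not p or p[-1] < 1.0:
        p.append(1.0 - math.exp(-n))
        n += 1
    q, acc, fact = [], 0.0, 1.0
    for k in range(1, 25):
        fact *= k
        acc += 1.0 / fact
        q.append(min(1.0, acc / (math.e - 1.0)))
    q[-1] = 1.0                              # guard so that step iv terminates
    return p, q

def exponential_marsaglia(alpha, p, q, rng=random):
    xi = rng.random()                        # step i
    i = 0
    while xi > p[i]:                         # step ii (p[i] holds p_{i+1})
        i += 1
    k = 1                                    # step iii
    xi = rng.random()
    xi_min = rng.random()
    while xi > q[k - 1]:                     # step iv (q[k-1] holds q_k)
        k += 1
        xi_star = rng.random()               # step v
        if xi_star < xi_min:
            xi_min = xi_star
    return alpha * (i + xi_min)              # step vi

if __name__ == "__main__":
    p, q = marsaglia_tables()
    rng = random.Random(1)
    sample = [exponential_marsaglia(2.0, p, q, rng) for _ in range(10000)]
    print(sum(sample) / len(sample))         # should be close to alpha = 2
```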
14.5.3 Method by Ahrens

The third method is due to J. H. Ahrens [28]

• Prepare

$$q_n = \frac{\ln 2}{1!} + \frac{(\ln 2)^2}{2!} + \cdots + \frac{(\ln 2)^n}{n!}$$

for $n = 1, 2, \ldots$ until the largest representable fraction less than one is exceeded.

i Put $a = 0$ and generate $\xi$.

ii If $\xi < \frac{1}{2}$ set $a = a + \ln 2 = a + q_1$, $\xi = 2\xi$ and perform this step again.

iii Set $\xi = 2\xi - 1$ and if $\xi \le \ln 2 = q_1$ then exit with $x = \alpha(a + \xi)$ else put $i = 2$ and
generate $\xi_{min}$.

iv Generate $\xi^*$ and put $\xi_{min} = \xi^*$ if $\xi^* < \xi_{min}$; then if $\xi > q_i$ put $i = i + 1$ and perform this
step again else exit with $x = \alpha(a + q_1\xi_{min})$.
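
The method may be sketched in Python as follows. The names are ours; the table is cut after a fixed number of terms (ample in double precision) with the last entry forced to one, and the freshly generated number in step iv is kept in its own variable so that it is not confused with the $\xi$ compared against $q_i$.

```python
import math
import random

def ahrens_table(size=24):
    """Table q_n = ln2/1! + (ln2)^2/2! + ... + (ln2)^n/n! for n = 1..size."""
    ln2 = math.log(2.0)
    q, term, acc = [], 1.0, 0.0
    for n in range(1, size + 1):
        term *= ln2 / n
        acc += term
        q.append(acc)
    q[-1] = 1.0                              # guard so that step iv terminates
    return q

def exponential_ahrens(alpha, q, rng=random):
    ln2 = q[0]
    a = 0.0                                  # step i
    xi = 1.0 - rng.random()                  # uniform in (0, 1]
    while xi < 0.5:                          # step ii
        a += ln2
        xi *= 2.0
    xi = 2.0 * xi - 1.0                      # step iii
    if xi <= ln2:
        return alpha * (a + xi)
    i = 2
    xi_min = rng.random()
    while True:                              # step iv
        xi_star = rng.random()
        if xi_star < xi_min:
            xi_min = xi_star
        if xi > q[i - 1]:                    # q[i-1] holds q_i
            i += 1
        else:
            return alpha * (a + ln2 * xi_min)

if __name__ == "__main__":
    q = ahrens_table()
    rng = random.Random(1)
    sample = [exponential_ahrens(2.0, q, rng) for _ in range(10000)]
    print(sum(sample) / len(sample))         # should be close to alpha = 2
```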


Of these three methods the method by Ahrens is the fastest. This is largely due to the
fact that the average number of uniform random numbers consumed by the three methods
is 1.69 for Ahrens, 3.58 for Marsaglia, and 4.31 for von Neumann. The method by Ahrens
is often as fast as the direct logarithm method on many computers.

15 Extreme Value Distribution


15.1 Introduction 


The extreme value distribution is given by

$$f(x;\mu,\sigma) = \frac{1}{\sigma} \exp\left\{ \mp\frac{x-\mu}{\sigma} - e^{\mp\frac{x-\mu}{\sigma}} \right\}$$

where the upper sign is for the maximum and the lower sign for the minimum (often
only the maximum is considered). The variable x and the parameter $\mu$ (the mode) are
real numbers while $\sigma$ is a positive real number. The distribution is sometimes referred
to as the Fisher-Tippett distribution (type I), the log-Weibull distribution, or the Gumbel
distribution after E. J. Gumbel (1891-1966).

The extreme value distribution gives the limiting distribution for the largest or smallest
elements of a set of independent observations from a distribution of exponential type
(normal, gamma, exponential, etc.).

A normalized form, useful to simplify calculations, is obtained by making the substitution
to the variable $z = \pm\frac{x-\mu}{\sigma}$ which has the distribution

$$g(z) = e^{-z - e^{-z}}$$

In figure 11 we show the distribution in this normalized form. The shape corresponds to
the case for the maximum value while the distribution for the minimum value would be
mirrored in $z = 0$.



Figure 11: The normalized Extreme Value Distribution

15.2 Cumulative Distribution

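The cumulative distribution for the extreme value distribution is given by

$$F(x) = \int_{-\infty}^{x} f(u)\,du = \int_{-\infty}^{\pm\frac{x-\mu}{\sigma}} g(z)\,dz = G\!\left(\pm\frac{x-\mu}{\sigma}\right)$$

where G(z) is the cumulative function of g(z) which is given by

$$G(z) = \int_{-\infty}^{z} e^{-u - e^{-u}}\,du = \int_{e^{-z}}^{\infty} e^{-y}\,dy = e^{-e^{-z}}$$

where we have made the substitution $y = e^{-u}$ in simplifying the integral. Strictly the
expression for F(x) holds for the upper sign; for the lower sign z decreases with x so that
$F(x) = 1 - G(z)$ and the roles of $Q_1$ and $Q_3$ below are interchanged. From this, and
using $x = \mu \pm \sigma z$, we find the position of the median and the lower and upper quartile as

$$M = \mu \mp \sigma\ln\ln 2 \approx \mu \pm 0.367\sigma,$$

$$Q_1 = \mu \mp \sigma\ln\ln 4 \approx \mu \mp 0.327\sigma, \quad \text{and}$$

$$Q_3 = \mu \mp \sigma\ln\ln\tfrac{4}{3} \approx \mu \pm 1.246\sigma$$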
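
The median and quartiles follow from inverting $G(z) = e^{-e^{-z}}$, which may be sketched in Python as below. The function name and the `maximum` flag are ours; for the minimum case the sketch uses $F(x) = 1 - G(z)$ with $z = -(x-\mu)/\sigma$, so that the returned value is a true quantile of x in both cases.

```python
import math

def extreme_value_quantile(p, mu, sigma, maximum=True):
    """Value x with F(x) = p for the extreme value distribution."""
    if maximum:
        z = -math.log(-math.log(p))          # solves G(z) = p
        return mu + sigma * z
    z = -math.log(-math.log(1.0 - p))        # solves 1 - G(z) = p
    return mu - sigma * z

if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        print(p, extreme_value_quantile(p, mu=0.0, sigma=1.0))
```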


15.3 Characteristic Function 

The characteristic function of the extreme value distribution is given by

$$\phi(t) = E\left(e^{itx}\right) = \int_{-\infty}^{\infty} e^{itx}\,\frac{1}{\sigma} \exp\left\{ \mp\frac{x-\mu}{\sigma} - e^{\mp\frac{x-\mu}{\sigma}} \right\} dx = \int_0^{\infty} e^{it(\mu \mp \sigma\ln z)}\,e^{-z}\,dz = e^{it\mu} \int_0^{\infty} z^{\mp it\sigma} e^{-z}\,dz = e^{it\mu}\,\Gamma(1 \mp it\sigma)$$

where we have made the substitution $z = \exp(\mp(x-\mu)/\sigma)$ i.e. $x = \mu \mp \sigma\ln z$ and thus
$dx = \mp\sigma\,dz/z$ to achieve an integral which could be expressed in terms of the Gamma
function (see section 42.2). As a check we may calculate the first algebraic moment, the
mean, by

$$\mu'_1 = \frac{1}{i} \left. \frac{d\phi(t)}{dt} \right|_{t=0} = \frac{1}{i} \left[ i\mu\,\Gamma(1) + \Gamma(1)\psi(1)(\mp i\sigma) \right] = \mu \pm \sigma\gamma$$

Here $\psi(1) = -\gamma$ is the digamma function, see section 42.3, and $\gamma$ is Euler's constant.
Similarly higher moments could be obtained from the characteristic function or, perhaps
even easier, we may find cumulants from the cumulant generating function $\ln\phi(t)$. In the
section below, however, moments are determined by a more direct technique.


15.4 Moments 

Algebraic moments for f(x) are given by

$$E(x^n) = \int_{-\infty}^{\infty} x^n f(x)\,dx = \int_{-\infty}^{\infty} (\mu \pm \sigma z)^n g(z)\,dz$$

which are related to moments of g(z)

$$E(z^n) = \int_{-\infty}^{\infty} z^n e^{-z - e^{-z}}\,dz = \int_0^{\infty} (-\ln y)^n e^{-y}\,dy$$

The first six such integrals, for n values from 1 to 6, are given by

$$\int_0^{\infty} (-\ln x)\,e^{-x}\,dx = \gamma$$

$$\int_0^{\infty} (-\ln x)^2 e^{-x}\,dx = \gamma^2 + \frac{\pi^2}{6}$$

$$\int_0^{\infty} (-\ln x)^3 e^{-x}\,dx = \gamma^3 + \frac{\gamma\pi^2}{2} + 2\zeta_3$$

$$\int_0^{\infty} (-\ln x)^4 e^{-x}\,dx = \gamma^4 + \gamma^2\pi^2 + \frac{3\pi^4}{20} + 8\gamma\zeta_3$$

$$\int_0^{\infty} (-\ln x)^5 e^{-x}\,dx = \gamma^5 + \frac{5\gamma^3\pi^2}{3} + \frac{3\gamma\pi^4}{4} + 20\gamma^2\zeta_3 + \frac{10\pi^2\zeta_3}{3} + 24\zeta_5$$

$$\int_0^{\infty} (-\ln x)^6 e^{-x}\,dx = \gamma^6 + \frac{5\gamma^4\pi^2}{2} + \frac{9\gamma^2\pi^4}{4} + \frac{61\pi^6}{168} + 40\gamma^3\zeta_3 + 20\gamma\pi^2\zeta_3 + 40\zeta_3^2 + 144\gamma\zeta_5$$

corresponding to the six first algebraic moments of $g(z)$. Here $\gamma$ is Euler's (or Euler-Mascheroni) constant

$$\gamma = \lim_{n\to\infty}\left(\sum_{k=1}^{n}\frac{1}{k} - \ln n\right) = 0.57721\,56649\,01532\,86060\,65120\ldots$$

and $\zeta_n$ is a short hand notation for Riemann's zeta-function $\zeta(n)$ given by

$$\zeta(z) = \sum_{k=1}^{\infty}\frac{1}{k^z} = \frac{1}{\Gamma(z)}\int_0^\infty \frac{x^{z-1}}{e^x - 1}\,dx \quad\text{for } z > 1$$

(see also [31]). For $z$ an even integer we may use

$$\zeta(2n) = \frac{2^{2n-1}\pi^{2n}|B_{2n}|}{(2n)!} \quad\text{for } n = 1, 2, \ldots$$

where $B_{2n}$ are the Bernoulli numbers given by $B_2 = \frac{1}{6}$, $B_4 = -\frac{1}{30}$, $B_6 = \frac{1}{42}$, $B_8 = -\frac{1}{30}$ etc (see table 4 on page 174 for an extensive table of the Bernoulli numbers). This implies $\zeta_2 = \frac{\pi^2}{6}$, $\zeta_4 = \frac{\pi^4}{90}$, $\zeta_6 = \frac{\pi^6}{945}$ etc.

For odd integer arguments no similar relation as for even integers exists but evaluating the sum of reciprocal powers the two numbers needed in the calculations above are given by $\zeta_3 = 1.20205\,69031\,59594\,28540\ldots$ and $\zeta_5 = 1.03692\,77551\,43369\,92633\ldots$. The number $\zeta_3$ is sometimes referred to as Apéry's constant after the person who in 1979 showed that it is an irrational number (but so far it is not known if it is also transcendental) [32].

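Using the algebraic moments of $g(z)$ as given above we may find the low order central moments of $g(z)$ as

$$\begin{aligned}
\mu_2 &= \frac{\pi^2}{6} = \zeta_2 \\
\mu_3 &= 2\zeta_3 \\
\mu_4 &= \frac{3\pi^4}{20} \\
\mu_5 &= \frac{10\pi^2\zeta_3}{3} + 24\zeta_5 \\
\mu_6 &= \frac{61\pi^6}{168} + 40\zeta_3^2
\end{aligned}$$

and thus the coefficients of skewness $\gamma_1$ and kurtosis $\gamma_2$ are given by

$$\gamma_1 = \mu_3/\mu_2^{3/2} = 12\sqrt{6}\,\zeta_3/\pi^3 \approx 1.13955$$

$$\gamma_2 = \mu_4/\mu_2^2 - 3 = 2.4$$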


Algebraic moments of $f(x)$ may be found from this with some effort. Central moments are simpler being connected to those for $g(z)$ through the relation $\mu_n(x) = (\pm 1)^n \sigma^n \mu_n(z)$.

In particular the expectation value and the variance of $f(x)$ are given by

$$E(x) = \mu \pm \sigma E(z) = \mu \pm \sigma\gamma$$

$$V(x) = \sigma^2 V(z) = \frac{\sigma^2\pi^2}{6}$$

The coefficients of skewness (except for a sign ±1) and kurtosis are the same as for $g(z)$.

15.5 Random Number Generation 

Using the expression for the cumulative distribution we may use a random number $\xi$, uniformly distributed between zero and one, to obtain a random number from the extreme value distribution by

$$G(z) = e^{-e^{-z}} = \xi \;\Rightarrow\; z = -\ln(-\ln\xi)$$

which gives a random number from the normalized function $g(z)$. A random number from $f(x)$ is then easily obtained by $x = \mu \pm \sigma z$.
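The inversion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `gumbel_random` and the `sign` argument (selecting the upper or lower sign in x = μ ± σz) are hypothetical and not part of the original text.

```python
import math
import random

def gumbel_random(mu=0.0, sigma=1.0, sign=+1, rng=random):
    """Return x = mu + sign*sigma*z with z = -ln(-ln(xi)), xi uniform in (0, 1)."""
    xi = rng.random()
    while xi <= 0.0:            # random() may return exactly 0.0, where ln is undefined
        xi = rng.random()
    z = -math.log(-math.log(xi))
    return mu + sign * sigma * z

# Quick check: the sample median of g(z) should be close to -ln(ln 2)
rng = random.Random(12345)
sample = sorted(gumbel_random(rng=rng) for _ in range(20000))
print("sample median  :", sample[len(sample) // 2])
print("expected median:", -math.log(math.log(2.0)))
```

The median is used for the check since it follows directly from the quartile expressions given earlier in this section.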
16 F-distribution


16.1 Introduction 

The F-distribution is given by

$$f(F;m,n) = \frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)}\,m^{\frac{m}{2}}n^{\frac{n}{2}}\,\frac{F^{\frac{m}{2}-1}}{(mF+n)^{\frac{m+n}{2}}} = \frac{m^{\frac{m}{2}}n^{\frac{n}{2}}}{B\left(\frac{m}{2},\frac{n}{2}\right)}\cdot\frac{F^{\frac{m}{2}-1}}{(mF+n)^{\frac{m+n}{2}}}$$

where the parameters $m$ and $n$, the degrees of freedom, are positive integers and the variable $F$ is a positive real number. The functions $\Gamma$ and $B$ are the usual Gamma and Beta functions. The distribution is often called the Fisher F-distribution, after the famous British statistician Sir Ronald Aylmer Fisher (1890-1962), sometimes the Snedecor F-distribution and sometimes the Fisher-Snedecor F-distribution. In figure 12 we show the F-distribution for low values of $m$ and $n$.




Figure 12: The F-distribution (a) for m = 10 and n = 1, 2, ..., 10 and (b) for m = 1, 2, ..., 10 and n = 10

For $m \le 2$ the distribution has its maximum at $F = 0$ and is monotonically decreasing. Otherwise the distribution has the mode at

$$F_{\text{mode}} = \frac{m-2}{m}\cdot\frac{n}{n+2}$$

This distribution is also known as the variance-ratio distribution since it, as will be 
shown below, describes the distribution of the ratio of the estimated variances from two 
independent samples from normal distributions with equal variance. 
16.2 Relations to Other Distributions

For $m = 1$ we obtain a $t^2$-distribution, the distribution of the square of a variable distributed according to Student's t-distribution. As $n \to \infty$ the quantity $mF$ approaches a chi-square distribution with $m$ degrees of freedom.

For large values of $m$ and $n$ the F-distribution tends to a normal distribution. There are several approximations found in the literature all of which are better than a simple-minded standardized variable. One is

$$z_1 = \frac{\sqrt{(2n-1)\frac{mF}{n}} - \sqrt{2m-1}}{\sqrt{1+\frac{mF}{n}}}$$

and an even better choice is

$$z_2 = \frac{F^{\frac{1}{3}}\left(1-\frac{2}{9n}\right) - \left(1-\frac{2}{9m}\right)}{\sqrt{\frac{2}{9m} + F^{\frac{2}{3}}\cdot\frac{2}{9n}}}$$

For large values of $m$ and $n$ also the distribution in the variable $z = \frac{\ln F}{2}$, the distribution of which is known as the Fisher z-distribution, is approximately normal with mean $\frac{1}{2}\left(\frac{1}{n}-\frac{1}{m}\right)$ and variance $\frac{1}{2}\left(\frac{1}{m}+\frac{1}{n}\right)$. This approximation is, however, not as good as $z_2$ above.


16.3 1/F 

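If $F$ is distributed according to the F-distribution with $m$ and $n$ degrees of freedom then $\frac{1}{F}$ has the F-distribution with $n$ and $m$ degrees of freedom. This is easily verified by a change of variables. Putting $G = \frac{1}{F}$, we have

$$f(G) = \left|\frac{dF}{dG}\right| f(F) = \frac{1}{G^2}\cdot\frac{m^{\frac{m}{2}}n^{\frac{n}{2}}}{B\left(\frac{m}{2},\frac{n}{2}\right)}\cdot\frac{\left(\frac{1}{G}\right)^{\frac{m}{2}-1}}{\left(\frac{m}{G}+n\right)^{\frac{m+n}{2}}} = \frac{m^{\frac{m}{2}}n^{\frac{n}{2}}}{B\left(\frac{m}{2},\frac{n}{2}\right)}\cdot\frac{G^{\frac{n}{2}-1}}{(m+nG)^{\frac{m+n}{2}}}$$

which is seen to be identical to an F-distribution with $n$ and $m$ degrees of freedom for $G = \frac{1}{F}$.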


16.4 Characteristic Function 

The characteristic function for the F-distribution may be expressed in terms of the confluent hypergeometric function $M$ (see section 43.3) as

$$\phi(t) = E(e^{iFt}) = M\left(\frac{m}{2}, -\frac{n}{2}; -\frac{n}{m}it\right)$$

16.5 Moments 

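Algebraic moments are given by

$$\begin{aligned}
\mu'_r &= \frac{m^{\frac{m}{2}}n^{\frac{n}{2}}}{B\left(\frac{m}{2},\frac{n}{2}\right)}\int_0^\infty \frac{F^{\frac{m}{2}-1+r}}{(mF+n)^{\frac{m+n}{2}}}\,dF = \left(\frac{m}{n}\right)^{\frac{m}{2}}\frac{1}{B\left(\frac{m}{2},\frac{n}{2}\right)}\int_0^\infty \frac{F^{\frac{m}{2}-1+r}}{\left(\frac{mF}{n}+1\right)^{\frac{m+n}{2}}}\,dF \\
&= \left(\frac{m}{n}\right)^{\frac{m}{2}}\frac{1}{B\left(\frac{m}{2},\frac{n}{2}\right)}\int_0^\infty \frac{\left(\frac{un}{m}\right)^{\frac{m}{2}-1+r}}{(u+1)^{\frac{m+n}{2}}}\,\frac{n}{m}\,du = \left(\frac{n}{m}\right)^{r}\frac{B\left(\frac{m}{2}+r,\frac{n}{2}-r\right)}{B\left(\frac{m}{2},\frac{n}{2}\right)} = \left(\frac{n}{m}\right)^{r}\frac{\Gamma\left(\frac{m}{2}+r\right)\Gamma\left(\frac{n}{2}-r\right)}{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)}
\end{aligned}$$

and are defined for $r < \frac{n}{2}$. This may be written

$$\mu'_r = \left(\frac{n}{m}\right)^{r}\frac{\frac{m}{2}\left(\frac{m}{2}+1\right)\cdots\left(\frac{m}{2}+r-1\right)}{\left(\frac{n}{2}-1\right)\left(\frac{n}{2}-2\right)\cdots\left(\frac{n}{2}-r\right)}$$

a form which may be more convenient for computations especially when $m$ or $n$ are large. A recursive formula to obtain the algebraic moments would thus be

$$\mu'_r = \mu'_{r-1}\cdot\frac{n}{m}\cdot\frac{\frac{m}{2}+r-1}{\frac{n}{2}-r}$$

starting with $\mu'_0 = 1$.

The first algebraic moment, the mean, becomes

$$E(F) = \frac{n}{n-2} \quad\text{for } n > 2$$

and the variance is given by

$$V(F) = \frac{2n^2(m+n-2)}{m(n-2)^2(n-4)} \quad\text{for } n > 4$$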
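As a small sketch of the recursion for the algebraic moments (each moment equals the previous one times (n/m)(m/2 + r − 1)/(n/2 − r), starting from one), the following Python function reproduces the closed expressions for the mean and the variance. The function name `f_algebraic_moment` is an assumption made for this illustration.

```python
def f_algebraic_moment(m, n, r):
    """r-th algebraic moment of the F-distribution with m and n degrees of freedom.

    Uses mu'_r = mu'_{r-1} * (n/m) * (m/2 + r - 1) / (n/2 - r) with mu'_0 = 1.
    """
    if r >= n / 2:
        raise ValueError("the moment of order r is only defined for r < n/2")
    moment = 1.0
    for k in range(1, r + 1):
        moment *= (n / m) * (m / 2 + k - 1) / (n / 2 - k)
    return moment

m, n = 5, 12
mean = f_algebraic_moment(m, n, 1)
variance = f_algebraic_moment(m, n, 2) - mean ** 2
print(mean, n / (n - 2))
print(variance, 2 * n**2 * (m + n - 2) / (m * (n - 2) ** 2 * (n - 4)))
```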


16.6 F-ratio 

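Regard $F = \frac{u/m}{v/n}$ where $u$ and $v$ are two independent variables distributed according to the chi-square distribution with $m$ and $n$ degrees of freedom, respectively.

The independence implies that the joint probability function in $u$ and $v$ is given by the product of the two chi-square distributions

$$f(u,v;m,n) = \left(\frac{\left(\frac{u}{2}\right)^{\frac{m}{2}-1}e^{-\frac{u}{2}}}{2\Gamma\left(\frac{m}{2}\right)}\right)\left(\frac{\left(\frac{v}{2}\right)^{\frac{n}{2}-1}e^{-\frac{v}{2}}}{2\Gamma\left(\frac{n}{2}\right)}\right)$$

If we change variables to $x = \frac{u/m}{v/n}$ and $y = v$ the distribution in $x$ and $y$ becomes

$$f(x,y;m,n) = \left|\frac{\partial(u,v)}{\partial(x,y)}\right| f(u,v;m,n)$$

The determinant of the Jacobian of the transformation is $\frac{ym}{n}$ and thus we have

$$f(x,y;m,n) = \frac{ym}{n}\left(\frac{\left(\frac{xym}{2n}\right)^{\frac{m}{2}-1}e^{-\frac{xym}{2n}}}{2\Gamma\left(\frac{m}{2}\right)}\right)\left(\frac{\left(\frac{y}{2}\right)^{\frac{n}{2}-1}e^{-\frac{y}{2}}}{2\Gamma\left(\frac{n}{2}\right)}\right)$$

Finally, since we are interested in the marginal distribution in $x$ we integrate over $y$

$$\begin{aligned}
f(x;m,n) &= \int_0^\infty f(x,y;m,n)\,dy = \frac{\left(\frac{m}{n}\right)^{\frac{m}{2}}x^{\frac{m}{2}-1}}{2^{\frac{m+n}{2}}\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)}\int_0^\infty y^{\frac{m+n}{2}-1}e^{-\frac{y}{2}\left(\frac{xm}{n}+1\right)}\,dy \\
&= \frac{\left(\frac{m}{n}\right)^{\frac{m}{2}}x^{\frac{m}{2}-1}}{2^{\frac{m+n}{2}}\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)}\cdot\frac{2^{\frac{m+n}{2}}\Gamma\left(\frac{m+n}{2}\right)}{\left(\frac{xm}{n}+1\right)^{\frac{m+n}{2}}} = \frac{\left(\frac{m}{n}\right)^{\frac{m}{2}}}{B\left(\frac{m}{2},\frac{n}{2}\right)}\cdot\frac{x^{\frac{m}{2}-1}}{\left(\frac{xm}{n}+1\right)^{\frac{m+n}{2}}}
\end{aligned}$$

which with $x = F$ is the F-distribution with $m$ and $n$ degrees of freedom. Here we used the integral

$$\int_0^\infty t^{z-1}e^{-\alpha t}\,dt = \frac{\Gamma(z)}{\alpha^z}$$

in simplifying the expression.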

16.7 Variance Ratio 

A practical example where the F-distribution is applicable is when estimates of the variance for two independent samples from normal distributions

$$s_1^2 = \sum_{i=1}^{m}\frac{(x_i-\bar{x})^2}{m-1} \quad\text{and}\quad s_2^2 = \sum_{i=1}^{n}\frac{(y_i-\bar{y})^2}{n-1}$$

have been made. In this case $s_1^2$ and $s_2^2$ are so called normal theory estimates of $\sigma_1^2$ and $\sigma_2^2$, i.e. $(m-1)s_1^2/\sigma_1^2$ and $(n-1)s_2^2/\sigma_2^2$ are distributed according to the chi-square distribution with $m-1$ and $n-1$ degrees of freedom, respectively.

In this case the quantity

$$F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$$

is distributed according to the F-distribution with $m-1$ and $n-1$ degrees of freedom. If the true variances of the two populations are indeed the same then the variance ratio $s_1^2/s_2^2$ has the F-distribution. We may thus use this ratio to test the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ versus the alternative $H_1: \sigma_1^2 \ne \sigma_2^2$ using the F-distribution. We would reject the null hypothesis at the $\alpha$ confidence level if the F-ratio is less than $F_{1-\alpha/2,m-1,n-1}$ or greater than $F_{\alpha/2,m-1,n-1}$ where $F_{\alpha,m,n}$ is defined by

$$\int_0^{F_{\alpha,m,n}} f(F;m,n)\,dF = 1-\alpha$$

i.e. $\alpha$ is the probability content of the distribution above the value $F_{\alpha,m,n}$. Note that the following relation between F-values corresponding to the same upper and lower confidence levels is valid

$$F_{\alpha,m,n} = \frac{1}{F_{1-\alpha,n,m}}$$
16.8 Analysis of Variance

As a simple example, which is often called analysis of variance, we regard $n$ observations of a dependent variable $x$ with overall mean $\bar{x}$ divided into $k$ classes on an independent variable. The mean in each class is denoted $\bar{x}_j$ for $j = 1, 2, \ldots, k$. In each of the $k$ classes there are $n_j$ observations together adding up to $n$, the total number of observations. Below we denote by $x_{ji}$ the i:th observation in class $j$.

Rewrite the total sum of squares of the deviations from the mean

$$\begin{aligned}
SS_x &= \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ji}-\bar{x})^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left[(x_{ji}-\bar{x}_j)+(\bar{x}_j-\bar{x})\right]^2 \\
&= \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left[(x_{ji}-\bar{x}_j)^2+(\bar{x}_j-\bar{x})^2+2(x_{ji}-\bar{x}_j)(\bar{x}_j-\bar{x})\right] \\
&= \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ji}-\bar{x}_j)^2+\sum_{j=1}^{k}\sum_{i=1}^{n_j}(\bar{x}_j-\bar{x})^2+2\sum_{j=1}^{k}(\bar{x}_j-\bar{x})\sum_{i=1}^{n_j}(x_{ji}-\bar{x}_j) \\
&= \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ji}-\bar{x}_j)^2+\sum_{j=1}^{k}n_j(\bar{x}_j-\bar{x})^2 = SS_{\text{within}} + SS_{\text{between}}
\end{aligned}$$

i.e. the total sum of squares is the sum of the sum of squares within classes and the sum of squares between classes. Expressed in terms of variances

$$nV(x) = \sum_{j=1}^{k}n_jV_j(x) + \sum_{j=1}^{k}n_j(\bar{x}_j-\bar{x})^2$$

If the variable $x$ is independent of the classification then the variance within groups and the variance between groups are both estimates of the same true variance. The quantity

$$F = \frac{SS_{\text{between}}/(k-1)}{SS_{\text{within}}/(n-k)}$$

is then distributed according to the F-distribution with $k-1$ and $n-k$ degrees of freedom. This may then be used in order to test the hypothesis of no dependence. A too high F-value would be unlikely and thus we can choose a confidence level at which we would reject the hypothesis of no dependence of $x$ on the classification.

Sometimes one also defines $\eta^2 = SS_{\text{between}}/SS_x$, the proportion of variance explained, as a measure of the strength of the effects of classes on the variable $x$.
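The decomposition into sums of squares within and between classes can be illustrated with a short Python sketch. The function name `one_way_anova` and the numbers in `data` are made up for this example and are not taken from the text.

```python
def one_way_anova(classes):
    """Return (ss_within, ss_between, F) for a list of classes of observations."""
    n = sum(len(c) for c in classes)          # total number of observations
    k = len(classes)                          # number of classes
    grand_mean = sum(sum(c) for c in classes) / n
    ss_within = 0.0
    ss_between = 0.0
    for c in classes:
        class_mean = sum(c) / len(c)
        ss_within += sum((x - class_mean) ** 2 for x in c)
        ss_between += len(c) * (class_mean - grand_mean) ** 2
    f_value = (ss_between / (k - 1)) / (ss_within / (n - k))
    return ss_within, ss_between, f_value

# Hypothetical observations in k = 3 classes
data = [[4.1, 3.9, 4.4], [5.0, 5.3, 4.8, 5.1], [3.2, 3.6, 3.5]]
ss_w, ss_b, f_value = one_way_anova(data)
eta2 = ss_b / (ss_w + ss_b)                   # proportion of variance explained
print(ss_w, ss_b, f_value, eta2)
```

The resulting F-value would then be compared with the F-distribution with k − 1 and n − k degrees of freedom.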


16.9 Calculation of Probability Content 

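In order to set confidence levels for the F-distribution we need to evaluate the cumulative function i.e. the integral

$$1-\alpha = \int_0^{F_\alpha} f(F;m,n)\,dF$$

where we have used the notation $F_\alpha$ instead of $F_{\alpha,m,n}$ for convenience.

$$\begin{aligned}
1-\alpha &= \frac{m^{\frac{m}{2}}n^{\frac{n}{2}}}{B\left(\frac{m}{2},\frac{n}{2}\right)}\int_0^{F_\alpha}\frac{F^{\frac{m}{2}-1}}{(mF+n)^{\frac{m+n}{2}}}\,dF = \frac{\left(\frac{m}{n}\right)^{\frac{m}{2}}}{B\left(\frac{m}{2},\frac{n}{2}\right)}\int_0^{F_\alpha}\frac{F^{\frac{m}{2}-1}}{\left(\frac{mF}{n}+1\right)^{\frac{m+n}{2}}}\,dF \\
&= \frac{\left(\frac{m}{n}\right)^{\frac{m}{2}}}{B\left(\frac{m}{2},\frac{n}{2}\right)}\int_0^{\frac{mF_\alpha}{n}}\frac{\left(\frac{un}{m}\right)^{\frac{m}{2}-1}}{(u+1)^{\frac{m+n}{2}}}\,\frac{n}{m}\,du = \frac{1}{B\left(\frac{m}{2},\frac{n}{2}\right)}\int_0^{\frac{mF_\alpha}{n}}\frac{u^{\frac{m}{2}-1}}{(1+u)^{\frac{m+n}{2}}}\,du
\end{aligned}$$

where we made the substitution $u = \frac{mF}{n}$. The last integral we recognize as the incomplete Beta function $B_x$ defined for $0 \le x \le 1$ as

$$B_x(p,q) = \int_0^x t^{p-1}(1-t)^{q-1}\,dt = \int_0^{\frac{x}{1-x}}\frac{u^{p-1}}{(1+u)^{p+q}}\,du$$

where we made the substitution $u = \frac{t}{1-t}$ i.e. $t = \frac{u}{1+u}$. We thus obtain

$$1-\alpha = \frac{B_x\left(\frac{m}{2},\frac{n}{2}\right)}{B\left(\frac{m}{2},\frac{n}{2}\right)} = I_x\left(\frac{m}{2},\frac{n}{2}\right)$$

with $\frac{x}{1-x} = \frac{mF_\alpha}{n}$ i.e. $x = \frac{mF_\alpha}{n+mF_\alpha}$. The variable $x$ thus has a Beta distribution. Note that also $I_x(a,b)$ is called the incomplete Beta function (for historical reasons discussed below but see also section 42.7).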


16.9.1 The Incomplete Beta function 

In order to evaluate the incomplete Beta function we may use the series expansion

$$B_x(p,q) = x^p\left[\frac{1}{p}+\frac{1-q}{p+1}x+\frac{(1-q)(2-q)}{2!(p+2)}x^2+\ldots+\frac{(1-q)(2-q)\cdots(n-q)}{n!(p+n)}x^n+\ldots\right]$$

For integer values of $q$, corresponding to even values of the number of degrees of freedom $n$, the sum may be stopped at the term with index $q-1$ since all remaining terms will be identical to zero in this case.

We may express the sum with successive terms expressed recursively in the previous term

$$B_x(p,q) = x^p\sum_{r=0}^{\infty}t_r \quad\text{with}\quad t_r = t_{r-1}\cdot\frac{x(r-q)(p+r-1)}{r(p+r)} \quad\text{starting with}\quad t_0 = \frac{1}{p}$$

The sum normally converges quite fast but beware that e.g. for $p = q = \frac{1}{2}$ ($m = n = 1$) the convergence is very slow. Also some cases with $q$ very big but $p$ small seem pathological since in these cases big terms with alternate signs cancel each other causing roundoff problems. It seems preferable to keep $q < p$ to assure faster convergence. This may be done by using the relation

$$B_x(p,q) = B_1(q,p) - B_{1-x}(q,p)$$

which if inserted in the formula for $1-\alpha$ gives

$$1-\alpha = \frac{B_1\left(\frac{n}{2},\frac{m}{2}\right)-B_{1-x}\left(\frac{n}{2},\frac{m}{2}\right)}{B\left(\frac{m}{2},\frac{n}{2}\right)} = \frac{B\left(\frac{m}{2},\frac{n}{2}\right)-B_{1-x}\left(\frac{n}{2},\frac{m}{2}\right)}{B\left(\frac{m}{2},\frac{n}{2}\right)}$$

since $B_1(p,q) = B(p,q) = B(q,p)$.
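A minimal Python sketch of the recursive series, with first term 1/p and each following term obtained from the previous one by the factor x(r − q)(p + r − 1)/(r(p + r)), may look as follows. The function names, the stopping criterion `eps` and the cap `max_terms` are assumptions for this illustration; the Beta function is evaluated through `math.lgamma`.

```python
import math

def incomplete_beta_series(x, p, q, eps=1e-15, max_terms=100000):
    """B_x(p, q) as x^p times the sum of recursively computed terms t_r."""
    term = 1.0 / p                       # t_0
    total = term
    for r in range(1, max_terms):
        term *= x * (r - q) * (p + r - 1) / (r * (p + r))
        total += term
        # For integer q the terms vanish identically from r = q onwards
        if term == 0.0 or abs(term) < eps * abs(total):
            break
    return x ** p * total

def f_probability_content(f_value, m, n):
    """1 - alpha = B_x(m/2, n/2) / B(m/2, n/2) with x = m F / (n + m F)."""
    p, q = m / 2.0, n / 2.0
    x = m * f_value / (n + m * f_value)
    beta = math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))
    return incomplete_beta_series(x, p, q) / beta

print(f_probability_content(1.0, 2, 4))   # compare with 1 - (4/6)**2 = 5/9
```

The sketch makes no attempt to swap arguments to keep q < p, so it inherits the slow convergence and roundoff problems described above for unfavourable parameter values.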

A numerically better way to evaluate the incomplete Beta function $I_x(a,b)$ is by the continued fraction formula [10]

$$I_x(a,b) = \frac{x^a(1-x)^b}{aB(a,b)}\left[\cfrac{1}{1+\cfrac{d_1}{1+\cfrac{d_2}{1+\cdots}}}\right]$$

Here

$$d_{2m+1} = -\frac{(a+m)(a+b+m)x}{(a+2m)(a+2m+1)} \quad\text{and}\quad d_{2m} = \frac{m(b-m)x}{(a+2m-1)(a+2m)}$$

and the formula converges rapidly for $x < (a+1)/(a+b+1)$. For other $x$-values the same formula may be used after applying the symmetry relation

$$I_x(a,b) = 1 - I_{1-x}(b,a)$$
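The continued fraction can be sketched in Python as below. This is an illustration only: it uses the coefficients of the form given in [10], with a minus sign in the odd coefficients, and evaluates the fraction bottom-up with a fixed number of terms (`n_terms`, an arbitrary choice) instead of the forward recurrence with a convergence test that a production routine would normally use. The function names are hypothetical.

```python
import math

def _cf_coefficient(j, a, b, x):
    """Coefficient d_j of the continued fraction for I_x(a, b)."""
    m = j // 2
    if j % 2 == 0:      # d_{2m}
        return m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
    # d_{2m+1}
    return -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))

def regularized_incomplete_beta(x, a, b, n_terms=200):
    """I_x(a, b) from the continued fraction, using symmetry for large x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if x > (a + 1.0) / (a + b + 1.0):
        # I_x(a, b) = 1 - I_{1-x}(b, a)
        return 1.0 - regularized_incomplete_beta(1.0 - x, b, a, n_terms)
    cf = 1.0
    for j in range(n_terms, 0, -1):      # evaluate 1 + d_1/(1 + d_2/(1 + ...))
        cf = 1.0 + _cf_coefficient(j, a, b, x) / cf
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_front = a * math.log(x) + b * math.log(1.0 - x) - math.log(a) - log_beta
    return math.exp(log_front) / cf

print(regularized_incomplete_beta(0.25, 0.5, 0.5))   # (2/pi) arcsin(sqrt(0.25)) = 1/3
```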


16.9.2 Final Formulae 

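Using the series expression for $B_x$ given in the previous subsection the probability content of the F-distribution may be calculated. The numerical situation is, however, not ideal. For integer a- or b-values³ the following relation to the binomial distribution valid for integer values of $a$ is useful

$$1 - I_x(a,b) = I_{1-x}(b,a) = \sum_{i=0}^{a-1}\binom{a+b-1}{i}x^i(1-x)^{a+b-1-i}$$

³ If only $b$ is an integer use the relation $I_x(a,b) = 1 - I_{1-x}(b,a)$.

Our final formulae are taken from [26], using $x = \frac{n}{n+mF}$ (note that this is one minus our previous definition of $x$). They are written here for the probability content $\alpha$ above the value $F$, so that the probability content below $F$ is $1-\alpha$ as in the integral above.

• Even m:

$$\alpha = x^{\frac{n}{2}}\left[1+\frac{n}{2}(1-x)+\frac{n(n+2)}{2\cdot 4}(1-x)^2+\ldots+\frac{n(n+2)\cdots(m+n-4)}{2\cdot 4\cdots(m-2)}(1-x)^{\frac{m-2}{2}}\right]$$

• Even n:

$$\alpha = 1-(1-x)^{\frac{m}{2}}\left[1+\frac{m}{2}x+\frac{m(m+2)}{2\cdot 4}x^2+\ldots+\frac{m(m+2)\cdots(m+n-4)}{2\cdot 4\cdots(n-2)}x^{\frac{n-2}{2}}\right]$$

• Odd m and n:

$$\alpha = 1 - A + \beta \quad\text{with}$$

$$A = \frac{2}{\pi}\left[\theta+\sin\theta\left(\cos\theta+\frac{2}{3}\cos^3\theta+\ldots+\frac{2\cdot 4\cdots(n-3)}{1\cdot 3\cdots(n-2)}\cos^{n-2}\theta\right)\right] \quad\text{for } n > 1 \text{ and}$$

$$\beta = \frac{2}{\sqrt{\pi}}\cdot\frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\cdot\sin\theta\cdot\cos^n\theta\cdot\left[1+\frac{n+1}{3}\sin^2\theta+\ldots+\frac{(n+1)(n+3)\cdots(m+n-4)}{3\cdot 5\cdots(m-2)}\sin^{m-3}\theta\right] \quad\text{for } m > 1$$

where

$$\theta = \arctan\sqrt{\frac{mF}{n}} \quad\text{and}\quad \frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)} = \frac{(n-1)!!}{(n-2)!!\,\sqrt{\pi}}$$

If $n = 1$ then $A = 2\theta/\pi$ and if $m = 1$ then $\beta = 0$.

• For large values of $m$ and $n$ we use an approximation using the standard normal distribution where

$$z = \frac{F^{\frac{1}{3}}\left(1-\frac{2}{9n}\right)-\left(1-\frac{2}{9m}\right)}{\sqrt{\frac{2}{9m}+F^{\frac{2}{3}}\cdot\frac{2}{9n}}}$$

is approximately distributed according to the standard normal distribution. Confidence levels are obtained by

$$1-\alpha = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z}e^{-\frac{x^2}{2}}\,dx$$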

In table 5 on page 175 we show some percentage points for the F-distribution. Here n 
is the degrees of freedom of the greater mean square and m the degrees of freedom for the 
lesser mean square. The values express the values of F which would be exceeded by pure 
chance in 10%, 5% and 1% of the cases, respectively. 

16.10 Random Number Generation 

Following the definition the quantity

$$F = \frac{y_m/m}{y_n/n}$$

where $y_n$ and $y_m$ are two variables distributed according to the chi-square distribution with $n$ and $m$ degrees of freedom respectively follows the F-distribution. We may thus use this relation inserting random numbers from chi-square distributions (see section 8.7).
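As a hedged sketch of this recipe in Python, the chi-square random numbers may be taken from the standard library routine `random.gammavariate`, using the fact that a chi-square variable with ν degrees of freedom is a Gamma variable with shape ν/2 and scale 2. This is a shortcut chosen for the illustration and not the chi-square generator of section 8.7; the function name `f_random` is hypothetical.

```python
import random

def f_random(m, n, rng=random):
    """Return F = (y_m / m) / (y_n / n) for independent chi-square y_m and y_n."""
    # gammavariate(shape, scale): chi-square with nu d.o.f. has shape nu/2, scale 2
    y_m = rng.gammavariate(m / 2.0, 2.0)
    y_n = rng.gammavariate(n / 2.0, 2.0)
    return (y_m / m) / (y_n / n)

rng = random.Random(2024)
m, n = 4, 12
sample = [f_random(m, n, rng) for _ in range(50000)]
print("sample mean  :", sum(sample) / len(sample))
print("expected mean:", n / (n - 2))
```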

















17 Gamma Distribution 


17.1 Introduction 

The Gamma distribution is given by

$$f(x;a,b) = a(ax)^{b-1}e^{-ax}/\Gamma(b)$$

where the parameters $a$ and $b$ are positive real quantities as is the variable $x$. Note that the parameter $a$ is simply a scale factor.

For $b \le 1$ the distribution is J-shaped and for $b > 1$ it is unimodal with its maximum at $x = \frac{b-1}{a}$.

In the special case where $b$ is a positive integer this distribution is often referred to as the Erlangian distribution.

For $b = 1$ we obtain the exponential distribution and with $a = \frac{1}{2}$ and $b = \frac{n}{2}$ with $n$ an integer we obtain the chi-squared distribution with $n$ degrees of freedom.

In figure 13 we show the Gamma distribution for b -values of 2 and 5. 





Figure 13: Examples of Gamma distributions 


17.2 Derivation of the Gamma Distribution 


For integer values of $b$, i.e. for Erlangian distributions, we may derive the Gamma distribution from the Poisson assumptions. For a Poisson process where events happen at a rate of $\lambda$ the number of events in a time interval $t$ is given by the Poisson distribution

$$P(r) = \frac{(\lambda t)^r e^{-\lambda t}}{r!}$$

The probability that the k:th event occurs after time $t$ is then given by

$$\sum_{r=0}^{k-1}P(r) = \sum_{r=0}^{k-1}\frac{(\lambda t)^r e^{-\lambda t}}{r!}$$

so that the probability that there are at least $k$ events in the time $t$, i.e. that the k:th event has occurred by time $t$, is given by

$$F(t) = \sum_{r=k}^{\infty}P(r) = 1-\sum_{r=0}^{k-1}\frac{(\lambda t)^r e^{-\lambda t}}{r!} = \int_0^{\lambda t}\frac{z^{k-1}e^{-z}}{(k-1)!}\,dz = \int_0^{t}\frac{\lambda^k z^{k-1}e^{-\lambda z}}{(k-1)!}\,dz$$

where the sum has been replaced by an integral (no proof given here) and the substitution $z \to \lambda z$ made at the end. This is the cumulative Gamma distribution with $a = \lambda$ and $b = k$, i.e. the time distribution for the k:th event follows a Gamma distribution. In particular we may note that the time distribution for the occurrence of the first event follows an exponential distribution.

The Erlangian distribution thus describes the time distribution for exponentially distributed events occurring in a series. For exponential processes in parallel the appropriate distribution is the hyperexponential distribution.


17.3 Moments 

The distribution has expectation value, variance, third and fourth central moments given by

$$E(x) = \frac{b}{a}, \quad V(x) = \frac{b}{a^2}, \quad \mu_3 = \frac{2b}{a^3}, \quad\text{and}\quad \mu_4 = \frac{3b(2+b)}{a^4}$$

The coefficients of skewness and kurtosis are given by

$$\gamma_1 = \frac{2}{\sqrt{b}} \quad\text{and}\quad \gamma_2 = \frac{6}{b}$$

More generally algebraic moments are given by

$$\begin{aligned}
\mu'_n &= \int_0^\infty x^n f(x)\,dx = \frac{a^b}{\Gamma(b)}\int_0^\infty x^{n+b-1}e^{-ax}\,dx = \frac{a^b}{\Gamma(b)}\int_0^\infty\left(\frac{y}{a}\right)^{n+b-1}e^{-y}\,\frac{dy}{a} \\
&= \frac{\Gamma(n+b)}{a^n\Gamma(b)} = \frac{b(b+1)\cdots(b+n-1)}{a^n}
\end{aligned}$$

where we have made the substitution $y = ax$ in simplifying the integral.


17.4 Characteristic Function 

The characteristic function is

$$\phi(t) = E(e^{itx}) = \frac{a^b}{\Gamma(b)}\int_0^\infty x^{b-1}e^{-x(a-it)}\,dx = \frac{a^b}{\Gamma(b)}\cdot\frac{1}{(a-it)^b}\int_0^\infty y^{b-1}e^{-y}\,dy = \left(1-\frac{it}{a}\right)^{-b}$$

where we made the transformation $y = x(a-it)$ in evaluating the integral.


17.5 Probability Content 


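In order to calculate the probability content for a Gamma distribution we need the cumulative (or distribution) function

$$F(x) = \int_0^x f(u)\,du = \frac{a^b}{\Gamma(b)}\int_0^x u^{b-1}e^{-au}\,du = \frac{a^b}{\Gamma(b)}\int_0^{ax}\left(\frac{v}{a}\right)^{b-1}e^{-v}\,\frac{dv}{a} = \frac{1}{\Gamma(b)}\int_0^{ax}v^{b-1}e^{-v}\,dv = \frac{\gamma(b,ax)}{\Gamma(b)}$$

where $\gamma(b,ax)$ denotes the incomplete gamma function⁴.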


17.6 Random Number Generation 

17.6.1 Erlangian distribution 

In the case of an Erlangian distribution ($b$ a positive integer) we obtain a random number by adding $b$ independent random numbers from an exponential distribution i.e.

$$x = -\ln(\xi_1\cdot\xi_2\cdot\ldots\cdot\xi_b)/a$$

where all the $\xi_i$ are uniform random numbers in the interval from zero to one. Note that care must be taken if $b$ is large in which case the product of uniform random numbers may become zero due to machine precision. In such cases simply divide the product in pieces and add the logarithms afterwards.
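A minimal Python sketch of this recipe is given below. To stay clear of the underflow problem it takes the extreme choice of adding one logarithm per uniform random number, which is safe but slower than taking the logarithm of a product of several numbers. The function name `erlang_random` is an assumption for this illustration.

```python
import math
import random

def erlang_random(a, b, rng=random):
    """Return x = -ln(xi_1 * ... * xi_b) / a, computed as a sum of logarithms."""
    total = 0.0
    for _ in range(b):
        xi = 1.0 - rng.random()      # uniform in (0, 1], so the logarithm is defined
        total -= math.log(xi)
    return total / a

rng = random.Random(99)
sample = [erlang_random(2.0, 5, rng) for _ in range(20000)]
print("sample mean  :", sum(sample) / len(sample))
print("expected mean:", 5 / 2.0)
```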

17.6.2 General case 

In a more general case we use the so called Jöhnk's algorithm

i Denote the integer part of $b$ with $i$ and the fractional part with $f$ and put $r = 0$. Let $\xi$ denote uniform random numbers in the interval from zero to one.

ii If $i > 0$ then put $r = -\ln(\xi_1\cdot\xi_2\cdot\ldots\cdot\xi_i)$.

iii If $f = 0$ then go to vii.

iv Calculate $w_1 = \xi_{i+1}^{1/f}$ and $w_2 = \xi_{i+2}^{1/(1-f)}$.

v If $w_1 + w_2 > 1$ then go back to iv.

vi Put $r = r - \ln(\xi_{i+3})\cdot\frac{w_1}{w_1+w_2}$.

vii Quit with $r = r/a$.
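The steps i to vii can be sketched in Python as follows. The function name `gamma_random_johnk` is hypothetical, and the integer part is handled by adding logarithms one at a time rather than by taking the logarithm of a product.

```python
import math
import random

def gamma_random_johnk(a, b, rng=random):
    """Gamma random number with parameters a (scale factor) and b (shape)."""
    i = int(b)                       # step i: integer part of b
    f = b - i                        #         fractional part of b
    r = 0.0
    for _ in range(i):               # step ii: Erlangian part
        r -= math.log(1.0 - rng.random())
    if f > 0.0:                      # step iii: skip if there is no fractional part
        while True:                  # steps iv and v: repeat until w1 + w2 <= 1
            w1 = rng.random() ** (1.0 / f)
            w2 = rng.random() ** (1.0 / (1.0 - f))
            if 0.0 < w1 + w2 <= 1.0:
                break
        # step vi: exponential random number scaled by w1 / (w1 + w2)
        r -= math.log(1.0 - rng.random()) * w1 / (w1 + w2)
    return r / a                     # step vii

rng = random.Random(5)
sample = [gamma_random_johnk(1.5, 2.6, rng) for _ in range(20000)]
print("sample mean  :", sum(sample) / len(sample))
print("expected mean:", 2.6 / 1.5)
```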



⁴ When integrated from zero to $x$ the incomplete gamma function is often denoted by $\gamma(a,x)$ while for the complement, integrated from $x$ to infinity, it is denoted $\Gamma(a,x)$. Sometimes the ratio $P(a,x) = \gamma(a,x)/\Gamma(a)$ is called the incomplete Gamma function.
17.6.3 Asymptotic Approximation

For $b$ big, say $b > 15$, we may use the Wilson-Hilferty approximation:

i Calculate $q = 1 - \frac{1}{9b} + \frac{z}{3\sqrt{b}}$ where $z$ is a random number from a standard normal distribution.

ii Calculate $r = b\cdot q^3$.

iii If $r < 0$ then go back to i.

iv Quit with $r = r/a$.
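A short Python sketch of these four steps is shown below, assuming the usual Wilson-Hilferty form q = 1 − 1/(9b) + z/(3√b) and taking the standard normal random number from `random.gauss`. The function name is hypothetical.

```python
import math
import random

def gamma_random_wilson_hilferty(a, b, rng=random):
    """Approximate Gamma random number for large b (say b > 15)."""
    while True:
        z = rng.gauss(0.0, 1.0)                               # step i
        q = 1.0 - 1.0 / (9.0 * b) + z / (3.0 * math.sqrt(b))
        r = b * q ** 3                                        # step ii
        if r >= 0.0:                                          # step iii: retry if negative
            return r / a                                      # step iv

rng = random.Random(17)
sample = [gamma_random_wilson_hilferty(2.0, 20.0, rng) for _ in range(20000)]
print("sample mean  :", sum(sample) / len(sample))
print("expected mean:", 20.0 / 2.0)
```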
18 Generalized Gamma Distribution


18.1 Introduction 

The Gamma distribution is often used to describe variables bounded on one side. An even more flexible version of this distribution is obtained by adding a third parameter giving the so called generalized Gamma distribution

$$f(x;a,b,c) = ac(ax)^{bc-1}e^{-(ax)^c}/\Gamma(b)$$

where $a$ (a scale parameter) and $b$ are the same real positive parameters as are used for the Gamma distribution but a third parameter $c$ has been added ($c = 1$ for the ordinary Gamma distribution). This new parameter may in principle take any real value but normally we consider the case where $c > 0$ or even $c \ge 1$. Put $|c|$ in the normalization for $f(x)$ if $c < 0$.

According to Hegyi [33] this density function first appeared in 1925 when L. Amoroso 
used it in analyzing the distribution of economic income. Later it has been used to describe 
the sizes of grains produced in comminution and drop size distributions in sprays etc. 

In figure 14 we show the generalized Gamma distribution for different values of c for 
the case a = 1 and b = 2. 



Figure 14: Examples of generalized Gamma distributions 


18.2 Cumulative Function 


The cumulative function is given by

$$F(x) = \begin{cases}\gamma\left(b,(ax)^c\right)/\Gamma(b) = P\left(b,(ax)^c\right) & \text{if } c > 0 \\ \Gamma\left(b,(ax)^c\right)/\Gamma(b) = 1-P\left(b,(ax)^c\right) & \text{if } c < 0\end{cases}$$

where $P$ is the incomplete Gamma function.
18.3 Moments

Algebraic moments are given by

$$\mu'_n = \frac{1}{a^n}\cdot\frac{\Gamma\left(b+\frac{n}{c}\right)}{\Gamma(b)}$$

For negative values of $c$ the moments are finite for ranks $n$ satisfying $n/c > -b$ (or even just avoiding the singularities $b+\frac{n}{c} \ne 0, -1, -2, \ldots$).

18.4 Relation to Other Distributions 

The generalized Gamma distribution is a general form which for certain parameter combinations gives many other distributions as special cases. In the table below we indicate some such relations. For notations see the corresponding section.

Distribution                a            b      c     Section
Generalized gamma           a            b      c     18
Gamma                       a            b      1     17
Chi-squared                 1/2          n/2    1     8
Exponential                 1/α          1      1     14
Weibull                     1/σ          1      η     41
Rayleigh                    1/(α√2)      1      2     37
Maxwell                     1/(α√2)      3/2    2     25
Standard normal (folded)    1/√2         1/2    2     34


In reference [33], where this distribution is used in the description of multiplicity distri¬ 
butions in high energy particle collisions, more examples on special cases as well as more 
details regarding the distribution are given. 
19 Geometric Distribution


19.1 Introduction 

The geometric distribution is given by

$$p(r;p) = p(1-p)^{r-1}$$

where the integer variable $r \ge 1$ and the parameter $0 < p < 1$ (no need to include limits since this gives trivial special cases). It expresses the probability of having to wait exactly $r$ trials before the first successful event if the probability of a success in a single trial is $p$ (probability of failure $q = 1-p$). It is a special case of the negative binomial distribution (with $k = 1$).


19.2 Moments 

The expectation value, variance, third and fourth central moments are given by

$$E(r) = \frac{1}{p}, \quad V(r) = \frac{1-p}{p^2}, \quad \mu_3 = \frac{(1-p)(2-p)}{p^3}, \quad \mu_4 = \frac{(1-p)(p^2-9p+9)}{p^4}$$

The coefficients of skewness and kurtosis are thus

$$\gamma_1 = \frac{2-p}{\sqrt{1-p}} \quad\text{and}\quad \gamma_2 = \frac{p^2-6p+6}{1-p}$$


19.3 Probability Generating Function 

The probability generating function is

$$G(z) = E(z^r) = \sum_{r=1}^{\infty}z^r p(1-p)^{r-1} = \frac{pz}{1-qz}$$


19.4 Random Number Generation 


The cumulative distribution may be written

$$P(k) = \sum_{r=1}^{k}p(r) = 1-q^k \quad\text{with } q = 1-p$$

which can be used in order to obtain a random number from a geometric distribution by generating uniform random numbers between zero and one until such a number (the k:th) is above $q^k$.

A more straightforward technique is to generate uniform random numbers $\xi_i$ until we find a success where $\xi_k \le p$.

These two methods are both very inefficient for low values of $p$. However, the first technique may be solved explicitly

$$\sum_{r=1}^{k}p(r) = \xi \;\Rightarrow\; k = \frac{\ln\xi}{\ln q}$$

(using that $1-\xi$ is uniformly distributed whenever $\xi$ is), which implies taking the largest integer less than $k+1$ as a random number from a geometric distribution. This method is quite independent of the value of $p$ and we found [14] that a reasonable breakpoint below which to use this technique is $p = 0.07$, using the first method mentioned above this limit. With such a method we do not gain by creating a cumulative vector for the random number generation as we do for many other discrete distributions.
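A sketch in Python of a generator along these lines is given below. It uses the explicit solution k = ln ξ / ln q below the quoted breakpoint p = 0.07 and, as a simplification chosen for this illustration, the plain trial-by-trial technique above it. The function name `geometric_random` is hypothetical.

```python
import math
import random

def geometric_random(p, rng=random):
    """Number of trials up to and including the first success (r >= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    if p == 1.0:
        return 1
    if p < 0.07:
        # Explicit inversion: smallest integer r with r >= ln(xi) / ln(q)
        xi = 1.0 - rng.random()              # uniform in (0, 1]
        k = math.log(xi) / math.log(1.0 - p)
        return max(1, math.ceil(k))
    r = 1
    while rng.random() > p:                  # a trial fails with probability q = 1 - p
        r += 1
    return r

rng = random.Random(31)
for p in (0.02, 0.3):
    sample = [geometric_random(p, rng) for _ in range(20000)]
    print(p, sum(sample) / len(sample), 1.0 / p)
```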
20 Hyperexponential Distribution

20.1 Introduction 

The hyperexponential distribution describes exponential processes in parallel and is given by

$$f(x;p,\lambda_1,\lambda_2) = p\lambda_1 e^{-\lambda_1 x} + q\lambda_2 e^{-\lambda_2 x}$$

where the variable $x$ and the parameters $\lambda_1$ and $\lambda_2$ are positive real quantities and $0 < p < 1$ is the proportion for the first process and $q = 1-p$ the proportion of the second.

The distribution describes the time between events in a process where the events are generated from two independent exponential distributions. For exponential processes in series we obtain the Erlangian distribution (a special case of the Gamma distribution).

The hyperexponential distribution is easily generalized to the case with $k$ exponential processes in parallel

$$f(x) = \sum_{i=1}^{k}p_i\lambda_i e^{-\lambda_i x}$$

where $\lambda_i$ is the slope and $p_i$ the proportion for each process (with the constraint that $\sum p_i = 1$).

The cumulative (distribution) function is

$$F(x) = p\left(1-e^{-\lambda_1 x}\right) + q\left(1-e^{-\lambda_2 x}\right)$$

and it is thus straightforward to calculate the probability content in any given situation.

20.2 Moments 

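Algebraic moments are given by

$$\mu'_n = n!\left(\frac{p}{\lambda_1^n}+\frac{q}{\lambda_2^n}\right)$$

Central moments become somewhat complicated but the second central moment, the variance of the distribution, is given by

$$\mu_2 = V(x) = 2\left(\frac{p}{\lambda_1^2}+\frac{q}{\lambda_2^2}\right)-\left(\frac{p}{\lambda_1}+\frac{q}{\lambda_2}\right)^2$$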


20.3 Characteristic Function 


The characteristic function of the hyperexponential distribution is given by

$$\phi(t) = \frac{p}{1-\frac{it}{\lambda_1}}+\frac{q}{1-\frac{it}{\lambda_2}}$$
20.4 Random Number Generation

Generating two uniform random numbers between zero and one, $\xi_1$ and $\xi_2$, we obtain a random number from a hyperexponential distribution by

• If $\xi_1 \le p$ then put $x = -\frac{\ln\xi_2}{\lambda_1}$.

• If $\xi_1 > p$ then put $x = -\frac{\ln\xi_2}{\lambda_2}$.

i.e. using $\xi_1$ we choose which of the two processes to use and with $\xi_2$ we generate an exponential random number for this process. The same technique is easily generalized to the case with $k$ processes.
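The two-step recipe may be sketched in Python as follows. The function name `hyperexponential_random` and the parameter values in the check are assumptions made for this illustration.

```python
import math
import random

def hyperexponential_random(p, lam1, lam2, rng=random):
    """Choose a process with the first uniform number, then draw an exponential."""
    xi1 = rng.random()
    xi2 = 1.0 - rng.random()             # uniform in (0, 1], so the logarithm is defined
    lam = lam1 if xi1 <= p else lam2     # first process with probability p
    return -math.log(xi2) / lam

rng = random.Random(77)
p, lam1, lam2 = 0.3, 1.0, 4.0
sample = [hyperexponential_random(p, lam1, lam2, rng) for _ in range(40000)]
print("sample mean  :", sum(sample) / len(sample))
print("expected mean:", p / lam1 + (1.0 - p) / lam2)
```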
21 Hypergeometric Distribution


21.1 Introduction 

The Hypergeometric distribution is given by

$$p(r;n,N,M) = \frac{\binom{M}{r}\binom{N-M}{n-r}}{\binom{N}{n}}$$

where the discrete variable $r$ has limits from $\max(0, n-N+M)$ to $\min(n, M)$ (inclusive). The parameters $n$ ($1 \le n \le N$), $N$ ($N \ge 1$) and $M$ ($M \ge 1$) are all integers.

This distribution describes the experiment where elements are picked at random without replacement. More precisely, suppose that we have $N$ elements out of which $M$ have a certain attribute (and $N-M$ have not). If we pick $n$ elements at random without replacement $p(r)$ is the probability that exactly $r$ of the selected elements come from the group with the attribute.

If $N \gg n$ this distribution approaches a binomial distribution with $p = \frac{M}{N}$.

If instead of two groups there are $k$ groups with different attributes we obtain the generalized hypergeometric distribution

$$p(\mathbf{r};n,N,\mathbf{M}) = \frac{\prod_{i=1}^{k}\binom{M_i}{r_i}}{\binom{N}{n}}$$

where, as before, $N$ is the total number of elements, $n$ the number of elements picked and $\mathbf{M}$ a vector with the number of elements of each attribute (whose sum should equal $N$). Here $n = \sum r_i$ and the limits for each $r_k$ are given by $\max(0, n-N+M_k) \le r_k \le \min(n, M_k)$.



21.2 Probability Generating Function 


The Hypergeometric distribution is closely related to the hypergeometric function, see appendix B on page 167, and the probability generating function is given by

$$G(z) = \frac{\binom{N-M}{n}}{\binom{N}{n}}\,{}_2F_1(-n,-M;N-M-n+1;z)$$


21.3 Moments 


With the notation $p = \frac{M}{N}$ and $q = 1-p$, i.e. the proportions of elements with and without the attribute, the expectation value, variance, third and fourth central moments are given by

$$\begin{aligned}
E(r) &= np \\
V(r) &= npq\,\frac{N-n}{N-1} \\
\mu_3 &= npq(q-p)\,\frac{(N-n)(N-2n)}{(N-1)(N-2)} \\
\mu_4 &= npq(N-n)\,\frac{N(N+1)-6n(N-n)+3pq\left(N^2(n-2)-Nn^2+6n(N-n)\right)}{(N-1)(N-2)(N-3)}
\end{aligned}$$

For the generalized hypergeometric distribution using $p_i = M_i/N$ and $q_i = 1-p_i$ we find moments of $r_i$ using the formulae above regarding the group $i$ as having an attribute and all other groups as not having the attribute. The covariances are given by

$$Cov(r_i,r_j) = -np_ip_j\,\frac{N-n}{N-1}$$


21.4 Random Number Generation 

To generate random numbers from a hypergeometric distribution one may construct a routine which follows the recipe above by picking elements at random. The same technique may be applied for the generalized hypergeometric distribution. Such techniques may be sufficient for many purposes but become quite slow.

For the hypergeometric distribution a better choice is to construct the cumulative function by adding up the individual probabilities using the recursive formula

$$p(r) = \frac{(M-r+1)(n-r+1)}{r(N-M-n+r)}\,p(r-1)$$

for the appropriate r-range (see above) starting with $p(r_{min})$. With the cumulative vector and one single uniform random number one may easily make a fast algorithm in order to obtain the required random number.
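A Python sketch of the cumulative-vector technique is shown below. The starting value is computed with `math.comb` (available from Python 3.8), subsequent probabilities follow from the ratio (M − r + 1)(n − r + 1)/(r(N − M − n + r)) between consecutive terms, and the lookup uses a binary search from the standard `bisect` module. The function names are hypothetical.

```python
import bisect
import math
import random

def hypergeometric_cumulative(n, N, M):
    """Return (r_min, cumulative vector) for the hypergeometric distribution."""
    r_min = max(0, n - N + M)
    r_max = min(n, M)
    prob = math.comb(M, r_min) * math.comb(N - M, n - r_min) / math.comb(N, n)
    cumulative = [prob]
    for r in range(r_min + 1, r_max + 1):
        prob *= (M - r + 1) * (n - r + 1) / (r * (N - M - n + r))
        cumulative.append(cumulative[-1] + prob)
    return r_min, cumulative

def hypergeometric_random(r_min, cumulative, rng=random):
    """Pick r by locating one uniform random number in the cumulative vector."""
    index = bisect.bisect_left(cumulative, rng.random())
    return r_min + min(index, len(cumulative) - 1)   # guard against roundoff in the last bin

r_min, cumulative = hypergeometric_cumulative(5, 20, 8)
rng = random.Random(123)
sample = [hypergeometric_random(r_min, cumulative, rng) for _ in range(20000)]
print("sample mean  :", sum(sample) / len(sample))
print("expected mean:", 5 * 8 / 20)
```

The cumulative vector is built once and reused for every random number, which is where the gain in speed comes from.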
22 Logarithmic Distribution

22.1 Introduction 


The logarithmic distribution is given by

$$p(r;p) = -\frac{(1-p)^r}{r\ln p}$$

where the variable $r \ge 1$ is an integer and the parameter $0 < p < 1$ is a real quantity.

It is a limiting form of the negative binomial distribution when the zero class has been omitted and the parameter $k \to 0$ (see section 29.4.3).


22.2 Moments 


The expectation value and variance are given by

$$E(r) = -\frac{\alpha q}{p} \quad\text{and}\quad V(r) = -\frac{\alpha q(1+\alpha q)}{p^2}$$

where we have introduced $q = 1-p$ and $\alpha = 1/\ln p$ for convenience. The third and fourth central moments are given by

$$\mu_3 = -\frac{\alpha q}{p^3}\left(1+q+3\alpha q+2\alpha^2q^2\right)$$

$$\mu_4 = -\frac{\alpha q}{p^4}\left(1+4q+q^2+4\alpha q(1+q)+6\alpha^2q^2+3\alpha^3q^3\right)$$

More generally factorial moments are easily found using the probability generating function

$$E(r(r-1)\cdots(r-k+1)) = \left.\frac{d^k}{dz^k}G(z)\right|_{z=1} = -(k-1)!\,\alpha\,\frac{q^k}{p^k}$$

From these moments ordinary algebraic and central moments may be found by straightforward but somewhat tedious algebra.


22.3 Probability Generating Function 

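The probability generating function is given by

$$G(z) = E(z^r) = \sum_{r=1}^{\infty}\left(-\frac{z^r(1-p)^r}{r\ln p}\right) = -\frac{1}{\ln p}\sum_{r=1}^{\infty}\frac{(zq)^r}{r} = \frac{\ln(1-zq)}{\ln(1-q)}$$

where $q = 1-p$ and since

$$\ln(1-x) = -\left(x+\frac{x^2}{2}+\frac{x^3}{3}+\frac{x^4}{4}+\ldots\right) \quad\text{for } -1 \le x < 1$$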
22.4 Random Number Generation


The most straightforward way to obtain random numbers from a logarithmic distribution
is to use the cumulative technique. If $p$ is fixed the most efficient way is to prepare a
cumulative vector starting with $p(1) = -\alpha q$ and subsequent elements by the recursive
formula $p(i) = p(i-1)\,q\,(i-1)/i$, which follows from $p(r) = -\alpha q^r/r$. The cumulative
vector may, however, become very long for small values of $p$. Ideally it should extend until
the cumulative vector element is exactly one due to computer precision. If $p$ is not fixed the
same procedure has to be made at each generation.
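
A minimal Python sketch of this cumulative technique is given below. It is an illustration
under stated assumptions rather than a reference implementation: the function names and the
`max_terms` safety limit are hypothetical, and the recursion used is
$p(i) = p(i-1)\,q\,(i-1)/i$ as obtained from the definition of the distribution.

```python
import bisect
import math
import random


def logarithmic_cumulative(p, max_terms=1_000_000):
    """Cumulative vector for the logarithmic distribution; index 0 holds r = 1."""
    q = 1.0 - p
    alpha = 1.0 / math.log(p)
    term = -alpha * q                    # p(1) = -alpha q
    cumulative = [term]
    i = 1
    while cumulative[-1] < 1.0 and i < max_terms:
        i += 1
        term *= q * (i - 1) / i          # p(i) = p(i-1) q (i-1)/i
        total = cumulative[-1] + term
        if total == cumulative[-1]:      # nothing changes within machine precision
            break
        cumulative.append(total)
    return cumulative


def logarithmic_random(cumulative, rng=random):
    """Draw r >= 1 from a prepared cumulative vector with one uniform number."""
    xi = rng.random()
    index = bisect.bisect_right(cumulative, xi)
    return min(index, len(cumulative) - 1) + 1
```

For small $p$ the vector grows to roughly a few tens divided by $p$ elements before the
terms fall below machine precision, which is the length problem mentioned in the text.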
23 Logistic Distribution

23.1 Introduction

The Logistic distribution is given by

$$f(x;a,k) = \frac{e^z}{k(1+e^z)^2} \quad \text{with } z = \frac{x-a}{k}$$

where the variable $x$ is a real quantity, the parameter $a$ a real location parameter (the
mode, median, and mean) and $k$ a positive real scale parameter (related to the standard
deviation). In figure 15 the logistic distribution with parameters $a = 0$ and $k = 1$ (i.e. $z = x$)
is shown.

Figure 15: Graph of logistic distribution for $a = 0$ and $k = 1$


23.2 Cumulative Distribution

The distribution function is given by

$$F(x) = 1-\frac{1}{1+e^z} = \frac{1}{1+e^{-z}} = \frac{1}{1+e^{-\frac{x-a}{k}}}$$

The inverse function is found by solving $F(x) = \alpha$ giving

$$x = F^{-1}(\alpha) = a - k\ln\left(\frac{1-\alpha}{\alpha}\right)$$

from which we may find e.g. the median as $\mathcal{M} = a$. Similarly the lower and upper quartiles
are given by $Q_{1,3} = a \mp k\ln 3$.

23.3 Characteristic Function

The characteristic function is given by

$$\phi(t) = E(e^{itx}) = \int_{-\infty}^{\infty} e^{itx}\,\frac{e^{\frac{x-a}{k}}}{k\left(1+e^{\frac{x-a}{k}}\right)^2}\,dx = e^{ita}\int_{-\infty}^{\infty}\frac{e^{itzk}e^z}{k(1+e^z)^2}\,k\,dz =$$

$$= e^{ita}\int_0^{\infty}\frac{y^{itk}\,y}{(1+y)^2}\,\frac{dy}{y} = e^{ita}B(1+itk,\,1-itk) = e^{ita}\,\frac{\Gamma(1+itk)\Gamma(1-itk)}{\Gamma(2)} =$$

$$= e^{ita}\,itk\,\Gamma(itk)\Gamma(1-itk) = e^{ita}\,\frac{itk\pi}{\sin \pi itk}$$

where we have used the transformations $z = (x-a)/k$ and $y = e^z$ in simplifying the integral,
at the end identifying the beta function, and using the relation of this to Gamma
functions and their properties (see appendix A in section 42).


23.4 Moments

The characteristic function is slightly awkward to use in determining the algebraic moments
by taking partial derivatives in $t$. However, using

$$\ln\phi(t) = ita + \ln\Gamma(1+itk) + \ln\Gamma(1-itk)$$

we may determine the cumulants of the distribution. In the process we take derivatives of
$\ln\phi(t)$ which involve polygamma functions (see section 42.4), all of them with argument
1 when inserting $t = 0$, a case which may be explicitly written in terms of Riemann's
zeta-function with even real argument (see page 59). It is quite easily found that all cumulants
of odd order except $\kappa_1 = a$ vanish and that for even orders

$$\kappa_{2n} = 2k^{2n}\psi^{(2n-1)}(1) = 2(2n-1)!\,k^{2n}\zeta(2n) = 2(2n-1)!\,k^{2n}\,\frac{2^{2n-1}\pi^{2n}|B_{2n}|}{(2n)!}$$

for $n = 1, 2, \ldots$ and where $B_{2n}$ are the Bernoulli numbers (see table 4 on page 174).

Using this formula lower order moments and the coefficients of skewness and kurtosis
are found to be

$$\mu_1' = E(x) = \kappa_1 = a$$

$$\mu_2 = V(x) = \kappa_2 = \frac{k^2\pi^2}{3}$$

$$\mu_3 = 0$$

$$\mu_4 = \kappa_4 + 3\kappa_2^2 = \frac{2k^4\pi^4}{15} + \frac{k^4\pi^4}{3} = \frac{7k^4\pi^4}{15}$$

$$\mu_5 = 0$$

$$\mu_6 = \kappa_6 + 15\kappa_4\kappa_2 + 10\kappa_3^2 + 15\kappa_2^3 = \frac{16k^6\pi^6}{63} + \frac{2k^6\pi^6}{3} + \frac{15k^6\pi^6}{27} = \frac{31k^6\pi^6}{21}$$

$$\gamma_1 = 0$$

$$\gamma_2 = 1.2 \ \text{(exact)}$$

23.5 Random numbers

Using the inverse cumulative function one easily obtains a random number from a logistic
distribution by

$$x = a + k\ln\left(\frac{\xi}{1-\xi}\right)$$

with $\xi$ a uniform random number between zero and one (limits not included).
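
The inverse-cumulative recipe above translates almost directly into code. The following
Python sketch is only an illustration; the function names are hypothetical, and the loop that
rejects a zero draw is an assumption made to respect the open interval.

```python
import math
import random


def logistic_cdf(x, a, k):
    """Distribution function F(x) = 1 / (1 + exp(-(x - a)/k))."""
    return 1.0 / (1.0 + math.exp(-(x - a) / k))


def logistic_random(a, k, rng=random):
    """Random number from the logistic distribution by inverting F."""
    xi = rng.random()
    while xi == 0.0:          # random() may return 0.0; both limits are excluded
        xi = rng.random()
    return a + k * math.log(xi / (1.0 - xi))
```

Evaluating `logistic_cdf` at $a \mp k\ln 3$ gives a quick check of the quartile formula from
the section on the cumulative distribution.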
24 Log-normal Distribution

24.1 Introduction

The log-normal distribution is given by

$$f(x;\mu,\sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{\ln x-\mu}{\sigma}\right)^2}$$

where the variable $x > 0$ and the parameters $\mu$ and $\sigma > 0$ all are real numbers. It is
sometimes denoted $\Lambda(\mu,\sigma^2)$ in the same spirit as we often denote a normally distributed
variable by $N(\mu,\sigma^2)$.

If $u$ is distributed as $N(\mu,\sigma^2)$ and $u = \ln x$ then $x$ is distributed according to the
log-normal distribution.

Note also that if $x$ has the distribution $\Lambda(\mu,\sigma^2)$ then $y = e^ax^b$ is distributed as
$\Lambda(a+b\mu,\,b^2\sigma^2)$.

In figure 16 we show the log-normal distribution for the basic form, with $\mu = 0$ and
$\sigma = 1$.

Figure 16: Log-normal distribution

The log-normal distribution is sometimes used as a first approximation to the Landau
distribution describing the energy loss by ionization of a heavy charged particle (cf. also the
Moyal distribution in section 26).

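24.2 Moments

The expectation value and the variance of the distribution are given by

$$E(x) = e^{\mu+\frac{\sigma^2}{2}} \quad \text{and} \quad V(x) = e^{2\mu+\sigma^2}\left(e^{\sigma^2}-1\right)$$

and the coefficients of skewness and kurtosis become

$$\gamma_1 = \sqrt{e^{\sigma^2}-1}\left(e^{\sigma^2}+2\right) \quad \text{and} \quad \gamma_2 = \left(e^{\sigma^2}-1\right)\left(e^{3\sigma^2}+3e^{2\sigma^2}+6e^{\sigma^2}+6\right)$$

More generally algebraic moments of the log-normal distribution are given by

$$\mu_k' = E(x^k) = \frac{1}{\sigma\sqrt{2\pi}}\int_0^{\infty}x^{k-1}e^{-\frac{1}{2}\left(\frac{\ln x-\mu}{\sigma}\right)^2}dx = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{yk}e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2}dy = e^{k\mu+\frac{k^2\sigma^2}{2}}$$

where we have used the transformation $y = \ln x$ in simplifying the integral.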


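24.3 Cumulative Distribution

The cumulative distribution, or distribution function, for the log-normal distribution is
given by

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}}\int_0^x\frac{1}{t}\,e^{-\frac{1}{2}\left(\frac{\ln t-\mu}{\sigma}\right)^2}dt = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\ln x}e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2}dy = \frac{1}{2} \pm \frac{1}{2}P\left(\frac{1}{2},\frac{z^2}{2}\right)$$

where we have put $z = (\ln x-\mu)/\sigma$, $P(a,x)$ is the incomplete Gamma function, and the
positive sign is valid for $z \ge 0$ and the negative sign for $z < 0$.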


24.4 Random Number Generation 

The most straightforward way of achieving random numbers from a log-normal distribution
is to generate a random number $u$ from a normal distribution with mean $\mu$ and standard
deviation $\sigma$ and construct $r = e^u$.
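
As a small illustration, the recipe can be written in Python as below. The function names
are hypothetical. The sketch also evaluates the distribution function, using the fact that
$P(\frac{1}{2}, z^2/2)$ equals the error function $\mathrm{erf}(|z|/\sqrt{2})$, which is
available in the standard library.

```python
import math
import random


def lognormal_random(mu, sigma, rng=random):
    """Exponentiate a normal random number with mean mu and standard deviation sigma."""
    return math.exp(rng.gauss(mu, sigma))


def lognormal_cdf(x, mu, sigma):
    """F(x) = 1/2 +- 1/2 P(1/2, z^2/2) with z = (ln x - mu)/sigma."""
    z = (math.log(x) - mu) / sigma
    half = 0.5 * math.erf(abs(z) / math.sqrt(2.0))   # 1/2 P(1/2, z^2/2)
    return 0.5 + half if z >= 0.0 else 0.5 - half
```

A sample mean from `lognormal_random` can be compared with $e^{\mu+\sigma^2/2}$ as a
consistency check.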
25 Maxwell Distribution

25.1 Introduction

The Maxwell distribution is given by

$$f(x;\alpha) = \frac{1}{\alpha^3}\sqrt{\frac{2}{\pi}}\,x^2e^{-\frac{x^2}{2\alpha^2}}$$

where the variable $x$ with $x \ge 0$ and the parameter $\alpha$ with $\alpha > 0$ are real quantities. It is
named after the famous Scottish physicist James Clerk Maxwell (1831-1879).

The parameter $\alpha$ is simply a scale factor and the variable $y = x/\alpha$ has the simplified
distribution

$$g(y) = \sqrt{\frac{2}{\pi}}\,y^2e^{-\frac{y^2}{2}}$$

Figure 17: The Maxwell distribution

The distribution, shown in figure 17, has a mode at $x = \sqrt{2}\,\alpha$ and is positively skewed.

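25.2 Moments

Algebraic moments are given by

$$E(x^n) = \int_0^{\infty}x^nf(x)\,dx = \frac{1}{2\alpha^3}\sqrt{\frac{2}{\pi}}\int_{-\infty}^{\infty}|x|^{n+2}e^{-x^2/2\alpha^2}dx$$

i.e. we have a connection to the absolute moments of the Gauss distribution. Using these
(see section on the normal distribution) the result is

$$E(x^n) = \begin{cases}\sqrt{\frac{2}{\pi}}\,2^kk!\,\alpha^{2k-1} & \text{for } n = 2k-1 \\ (n+1)!!\,\alpha^n & \text{for } n \text{ even}\end{cases}$$

Specifically we note that the expectation value, variance, and the third and fourth
central moments are given by

$$E(x) = 2\alpha\sqrt{\frac{2}{\pi}}, \quad V(x) = \alpha^2\left(3-\frac{8}{\pi}\right), \quad \mu_3 = 2\alpha^3\left(\frac{16}{\pi}-5\right)\sqrt{\frac{2}{\pi}}, \quad \text{and} \quad \mu_4 = \alpha^4\left(15+\frac{16}{\pi}-\frac{192}{\pi^2}\right)$$

The coefficients of skewness and kurtosis are thus

$$\gamma_1 = \frac{2\left(\frac{16}{\pi}-5\right)\sqrt{\frac{2}{\pi}}}{\left(3-\frac{8}{\pi}\right)^{3/2}} \approx 0.48569 \quad \text{and} \quad \gamma_2 = \frac{15+\frac{16}{\pi}-\frac{192}{\pi^2}}{\left(3-\frac{8}{\pi}\right)^2}-3 \approx 0.10816$$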


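25.3 Cumulative Distribution

The cumulative distribution, or the distribution function, is given by

$$F(x) = \int_0^xf(y)\,dy = \frac{1}{\alpha^3}\sqrt{\frac{2}{\pi}}\int_0^xy^2e^{-\frac{y^2}{2\alpha^2}}dy = \frac{2}{\sqrt{\pi}}\int_0^{\frac{x^2}{2\alpha^2}}\sqrt{z}\,e^{-z}dz = \frac{\gamma\left(\frac{3}{2},\frac{x^2}{2\alpha^2}\right)}{\Gamma\left(\frac{3}{2}\right)} = P\left(\frac{3}{2},\frac{x^2}{2\alpha^2}\right)$$

where we have made the substitution $z = \frac{y^2}{2\alpha^2}$ in order to simplify the integration. Here
$P(a,x)$ is the incomplete Gamma function.

Using the above relation we may estimate the median $\mathcal{M}$ and the lower and upper
quartile, $Q_1$ and $Q_3$, as

$$Q_1 = \alpha\sqrt{2P^{-1}\left(\tfrac{3}{2},\tfrac{1}{4}\right)} \approx 1.10115\,\alpha$$

$$\mathcal{M} = \alpha\sqrt{2P^{-1}\left(\tfrac{3}{2},\tfrac{1}{2}\right)} \approx 1.53817\,\alpha$$

$$Q_3 = \alpha\sqrt{2P^{-1}\left(\tfrac{3}{2},\tfrac{3}{4}\right)} \approx 2.02691\,\alpha$$

where $P^{-1}(a,p)$ denotes the inverse of the incomplete Gamma function i.e. the value $x$ for
which $P(a,x) = p$.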


25.4 Kinetic Theory

The following is taken from kinetic theory, see e.g. [34]. Let $v = (v_x, v_y, v_z)$ be the velocity
vector of a particle where each component is distributed independently according to normal
distributions with zero mean and the same variance $\sigma^2$.

First construct

$$w = \frac{v^2}{\sigma^2} = \frac{v_x^2}{\sigma^2}+\frac{v_y^2}{\sigma^2}+\frac{v_z^2}{\sigma^2}$$

Since $v_x/\sigma$, $v_y/\sigma$, and $v_z/\sigma$ are distributed as standard normal variables the sum of their
squares has the chi-squared distribution with 3 degrees of freedom i.e.
$g(w) = \frac{\sqrt{w}}{\sqrt{2\pi}}\,e^{-w/2}$ which leads to

$$f(v) = g(w)\left|\frac{dw}{dv}\right| = g\left(\frac{v^2}{\sigma^2}\right)\frac{2v}{\sigma^2} = \frac{1}{\sigma^3}\sqrt{\frac{2}{\pi}}\,v^2e^{-\frac{v^2}{2\sigma^2}}$$

which we recognize as a Maxwell distribution with $\alpha = \sigma$.

In kinetic theory $\sigma^2 = kT/m$, where $k$ is Boltzmann's constant, $T$ the temperature, and
$m$ the mass of the particles, and we thus have

$$f(v) = \sqrt{\frac{2m^3}{\pi k^3T^3}}\,v^2e^{-\frac{mv^2}{2kT}}$$

The distribution in kinetic energy $E = mv^2/2$ becomes

$$g(E) = \sqrt{\frac{4E}{\pi k^3T^3}}\,e^{-\frac{E}{kT}}$$

which is a Gamma distribution with parameters $a = 1/kT$ and $b = \frac{3}{2}$.


25.5 Random Number Generation

To obtain random numbers from the Maxwell distribution we first make the transformation
$y = x^2/2\alpha^2$, a variable which follows the Gamma distribution $g(y) = \sqrt{y}\,e^{-y}/\Gamma(\frac{3}{2})$.

A random number from this distribution may be obtained using the so called Johnk's
algorithm which in this particular case becomes (denoting independent pseudorandom numbers
from a uniform distribution from zero to one by $\xi_i$)

i Put $r = -\ln\xi_1$ i.e. a random number from an exponential distribution.

ii Calculate $w_1 = \xi_2^2$ and $w_2 = \xi_3^2$ (with new uniform random numbers $\xi_2$ and $\xi_3$ each
iteration, of course).

iii If $w = w_1+w_2 > 1$ then go back to ii above.

iv Put $r = r - \frac{w_1}{w}\ln\xi_4$.

v Finally construct $\alpha\sqrt{2r}$ as a random number from the Maxwell distribution with
parameter $\alpha$.

Following the examples given above we may also use three independent random numbers
from a standard normal distribution, $z_1$, $z_2$, and $z_3$, and construct

$$r = \alpha\sqrt{z_1^2+z_2^2+z_3^2}$$

However, this technique is not as efficient as the one outlined above.

As a third alternative we could also use the cumulative distribution putting

$$F(x) = \xi \;\Rightarrow\; P\left(\frac{3}{2},\frac{x^2}{2\alpha^2}\right) = \xi \;\Rightarrow\; x = \alpha\sqrt{2P^{-1}\left(\frac{3}{2},\xi\right)}$$

where $P^{-1}(a,p)$, as above, denotes the value $x$ where $P(a,x) = p$. This technique is,
however, much slower than the alternatives given above.

The first technique described above is not very fast but still the best alternative presented
here. Also it is less dependent on numerical algorithms (such as those to find the
inverse of the incomplete Gamma function) which may affect the precision of the method.
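
Steps i to v of the first technique can be sketched in Python as follows. This is an
illustrative reading of the algorithm, not a reference implementation: the function names are
hypothetical, and the helper that skips a zero uniform draw is an added assumption to keep
the logarithms finite.

```python
import math
import random


def _uniform_open(rng):
    """Uniform random number in (0, 1), excluding zero so that log() is finite."""
    xi = rng.random()
    while xi == 0.0:
        xi = rng.random()
    return xi


def maxwell_random(alpha, rng=random):
    """Random number from the Maxwell distribution with scale parameter alpha."""
    r = -math.log(_uniform_open(rng))              # step i: exponential number
    while True:
        w1 = rng.random() ** 2                     # step ii
        w2 = rng.random() ** 2
        w = w1 + w2
        if 0.0 < w <= 1.0:                         # step iii: otherwise retry
            break
    r -= (w1 / w) * math.log(_uniform_open(rng))   # step iv
    return alpha * math.sqrt(2.0 * r)              # step v
```

The acceptance probability in step iii is $\pi/4$, the area of the quarter unit circle in
the $(\xi_2, \xi_3)$ plane, so on average about 1.3 passes through step ii are needed per
random number.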
26 Moyal Distribution

26.1 Introduction

The Moyal distribution is given by

$$f(z) = \frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(z+e^{-z}\right)\right\}$$

for real values of $z$. A location shift and a scale factor are introduced by making the
standardized variable $z = (x-\mu)/\sigma$ and hence the distribution in the variable $x$ is given by

$$g(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left[\frac{x-\mu}{\sigma}+\exp\left(-\frac{x-\mu}{\sigma}\right)\right]\right\}$$

Without loss of generality we treat the Moyal distribution in its simpler form, $f(z)$, in this
document. Properties for $g(x)$ are easily obtained from these results, as is sometimes
indicated.

The Moyal distribution is a universal form for

(a) the energy loss by ionization for a fast charged particle and

(b) the number of ion pairs produced in this process.

It was proposed by J. E. Moyal [35] as a good approximation to the Landau distribution.
It was also shown that it remains valid taking into account quantum resonance effects and
details of atomic structure of the absorber.

Figure 18: The Moyal distribution

The distribution, shown in figure 18, has a mode at $z = 0$ and is positively skewed.
This implies that the mode of the $x$-distribution, $g(x)$, is equal to the parameter $\mu$.
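26.2 Normalization

Making the transformation $x = e^{-z}$ we find that

$$\int_{-\infty}^{\infty}f(z)\,dz = \int_0^{\infty}\frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(-\ln x+x\right)\right\}\frac{dx}{x} = \frac{1}{\sqrt{2\pi}}\int_0^{\infty}\frac{e^{-\frac{x}{2}}}{\sqrt{x}}\,dx =$$

$$= \frac{1}{\sqrt{2\pi}}\int_0^{\infty}\frac{e^{-y}}{\sqrt{2y}}\,2\,dy = \frac{1}{\sqrt{\pi}}\int_0^{\infty}\frac{e^{-y}}{\sqrt{y}}\,dy = \frac{1}{\sqrt{\pi}}\,\Gamma\left(\frac{1}{2}\right) = 1$$

where we have made the simple substitution $y = x/2$ in order to clearly recognize the
Gamma function at the end. The distribution is thus properly normalized.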


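26.3 Characteristic Function

The characteristic function for the Moyal distribution becomes

$$\phi(t) = E(e^{itz}) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{itz}e^{-\frac{1}{2}\left(z+e^{-z}\right)}dz = \frac{1}{\sqrt{2\pi}}\int_0^{\infty}(2x)^{\frac{1}{2}-it}e^{-x}\,\frac{dx}{x} = \frac{2^{-it}}{\sqrt{\pi}}\,\Gamma\left(\frac{1}{2}-it\right)$$

where we made the substitution $x = e^{-z}/2$ in simplifying the integral. The last relation to
the Gamma function with complex argument is valid when the real part of the argument
is positive which indeed is true in the case at hand.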


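26.4 Moments

As in some other cases the most convenient way to find the moments of the distribution is
via its cumulants (see section 2.5). We find that

$$\kappa_1 = -\ln 2 - \psi\left(\tfrac{1}{2}\right) = \ln 2 + \gamma$$

$$\kappa_n = (-1)^n\psi^{(n-1)}\left(\tfrac{1}{2}\right) = (n-1)!\,(2^n-1)\,\zeta_n \quad \text{for } n \ge 2$$

with $\gamma \approx 0.5772156649$ Euler's constant, $\psi^{(n)}$ polygamma functions (see section 42.4) and
$\zeta$ Riemann's zeta-function (see page 59). Using the cumulants we find the lower order
moments and the coefficients of skewness and kurtosis to be

$$\mu_1' = E(z) = \kappa_1 = \ln 2 + \gamma \approx 1.27036$$

$$\mu_2 = V(z) = \kappa_2 = \psi^{(1)}\left(\tfrac{1}{2}\right) = \frac{\pi^2}{2} \approx 4.93480$$

$$\mu_3 = \kappa_3 = -\psi^{(2)}\left(\tfrac{1}{2}\right) = 14\,\zeta_3$$

$$\mu_4 = \kappa_4 + 3\kappa_2^2 = \psi^{(3)}\left(\tfrac{1}{2}\right) + 3\,\psi^{(1)}\left(\tfrac{1}{2}\right)^2 = \frac{7\pi^4}{4}$$

$$\gamma_1 = \frac{28\sqrt{2}\,\zeta_3}{\pi^3} \approx 1.53514$$

$$\gamma_2 = 4$$

For the distribution $g(x)$ we have $E(x) = \sigma E(z)+\mu$, $V(x) = \sigma^2V(z)$ or more generally
central moments are obtained by $\mu_n(x) = \sigma^n\mu_n(z)$ for $n \ge 2$ while $\gamma_1$ and $\gamma_2$ are identical.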

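26.5 Cumulative Distribution

Using the same transformations as was used above in evaluating the normalization of the
distribution we write the cumulative (or distribution) function as

$$F(z) = \int_{-\infty}^zf(t)\,dt = \frac{1}{\sqrt{2\pi}}\int_{e^{-z}}^{\infty}\frac{e^{-\frac{x}{2}}}{\sqrt{x}}\,dx = \frac{1}{\sqrt{\pi}}\int_{e^{-z}/2}^{\infty}\frac{e^{-y}}{\sqrt{y}}\,dy = 1 - P\left(\frac{1}{2},\frac{e^{-z}}{2}\right)$$

where $P$ is the incomplete Gamma function.

Using the inverse of the cumulative function we find the median $\mathcal{M} \approx 0.78760$ and the
lower and upper quartiles $Q_1 \approx -0.28013$ and $Q_3 \approx 2.28739$.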

26.6 Random Number Generation

To obtain random numbers from the Moyal distribution we may either make use of the
inverse to the incomplete Gamma function such that given a pseudorandom number $\xi$ we
get a random number by solving the equation

$$1 - P\left(\frac{1}{2},\frac{e^{-z}}{2}\right) = \xi$$

for $z$. If $P^{-1}(a,p)$ denotes the value $x$ where $P(a,x) = p$ then

$$z = -\ln\left\{2P^{-1}\left(\frac{1}{2},1-\xi\right)\right\}$$

is a random number from a Moyal distribution.

This is, however, a very slow method and one may instead use a straightforward reject-accept
(or hit-miss) method. To do this we prefer to transform the distribution to get it
into a finite interval. For this purpose we make the transformation $\tan y = z$ giving

$$h(y) = f(\tan y)\,\frac{1}{\cos^2y} = \frac{1}{\sqrt{2\pi}\cos^2y}\exp\left\{-\frac{1}{2}\left(\tan y+e^{-\tan y}\right)\right\}$$

This distribution, shown in figure 19, has a maximum of about 0.911 and is limited to the
interval $-\frac{\pi}{2} \le y \le \frac{\pi}{2}$.

Figure 19: Transformed Moyal distribution

A simple algorithm to get random numbers from a Moyal distribution, either $f(z)$ or
$g(x)$, using the reject-accept technique is as follows:

a Get into $\xi_1$ and $\xi_2$ two random numbers uniformly distributed between zero
and one using a good basic pseudorandom number generator.

b Calculate uniformly distributed variables along the horizontal and vertical direction
by $y = \pi\xi_1-\frac{\pi}{2}$ and $h = \xi_2h_{max}$ where $h_{max} = 0.912$ is chosen slightly larger than the
maximum value of the function.

c Calculate $z = \tan y$ and the function value $h(y)$.

d If $h \le h(y)$ then accept $z$ as a random number from the Moyal distribution $f(z)$ else
go back to point a above.

e If required then scale and shift the result by $x = z\sigma+\mu$ in order to obtain a random
number from $g(x)$.

This method is easily improved e.g. by making a tighter envelope to the distribution
than a uniform distribution. The efficiency of the reject-accept technique outlined here is
only $1/0.912\pi \approx 0.35$ (the ratio between the area of the curve and the uniform distribution).
The method seems, however, fast enough for most applications.
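
The reject-accept steps a to e can be sketched in Python as below. The function and
constant names are hypothetical, and the cut at $z < -700$ is an assumption added only to
avoid floating-point overflow in $e^{-z}$; the density there is zero to machine precision.

```python
import math
import random

H_MAX = 0.912   # slightly above the maximum (about 0.911) of the transformed density


def moyal_pdf(z):
    """Moyal density f(z)."""
    if z < -700.0:            # exp(-z) would overflow; f(z) is numerically zero here
        return 0.0
    return math.exp(-0.5 * (z + math.exp(-z))) / math.sqrt(2.0 * math.pi)


def transformed_pdf(y):
    """Density h(y) = f(tan y) / cos^2 y on the interval (-pi/2, pi/2)."""
    c = math.cos(y)
    return moyal_pdf(math.tan(y)) / (c * c)


def moyal_random(mu=0.0, sigma=1.0, rng=random):
    """Reject-accept generation; returns x = z*sigma + mu with z from f(z)."""
    while True:
        y = math.pi * rng.random() - 0.5 * math.pi   # steps a and b
        h = H_MAX * rng.random()
        if h <= transformed_pdf(y):                  # steps c and d
            return math.tan(y) * sigma + mu          # step e
```

With the uniform envelope roughly one trial in three is accepted, in line with the
efficiency quoted above.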
27 Multinomial Distribution

27.1 Introduction

The Multinomial distribution is given by

$$p(r;N,k,p) = \frac{N!}{r_1!\,r_2!\cdots r_k!}\,p_1^{r_1}p_2^{r_2}\cdots p_k^{r_k} = N!\prod_{i=1}^k\frac{p_i^{r_i}}{r_i!}$$

where the variable $r$ is a vector with $k$ integer elements for which $0 \le r_i \le N$ and $\sum r_i = N$.
The parameters $N > 0$ and $k \ge 2$ are integers and $p$ is a vector with elements $0 \le p_i \le 1$
with the constraint that $\sum p_i = 1$.

The distribution is a generalization of the Binomial distribution ($k = 2$) to many dimensions
where, instead of two groups, the $N$ elements are divided into $k$ groups each with
a probability $p_i$ with $i$ ranging from 1 to $k$. A common example is a histogram with $N$
entries in $k$ bins.


27.2 Histogram

The histogram example is valid when the total number of events $N$ is regarded as a fixed
number. The variance in each bin then becomes, see also below, $V(r_i) = Np_i(1-p_i) \approx r_i$
if $p_i \ll 1$ which normally is the case for a histogram with many bins.

If, however, we may regard the total number of events $N$ as a random variable distributed
according to the Poisson distribution we find: Given a multinomial distribution,
here denoted $M(r;N,p)$, for the distribution of events into bins for fixed $N$ and a Poisson
distribution, denoted $P(N;\nu)$, for the distribution of $N$ we write the joint distribution

$$\mathcal{P}(r,N) = M(r;N,p)\,P(N;\nu) = \left(\frac{N!}{r_1!\,r_2!\ldots r_k!}\,p_1^{r_1}p_2^{r_2}\cdots p_k^{r_k}\right)\left(\frac{\nu^Ne^{-\nu}}{N!}\right) =$$

$$= \left(\frac{1}{r_1!}(\nu p_1)^{r_1}e^{-\nu p_1}\right)\left(\frac{1}{r_2!}(\nu p_2)^{r_2}e^{-\nu p_2}\right)\cdots\left(\frac{1}{r_k!}(\nu p_k)^{r_k}e^{-\nu p_k}\right)$$

where we have used that

$$\sum_{i=1}^kp_i = 1 \quad \text{and} \quad \sum_{i=1}^kr_i = N$$

i.e. we get a product of independent Poisson distributions with means $\nu p_i$ for each individual
bin.

As seen, in both cases, we find justification for the normal rule of thumb to assign the
square root of the bin contents as the error in a certain bin. Note, however, that in principle
we should insert the true value of $r_i$ for this error. Since this normally is unknown we use
the observed number of events in accordance with the law of large numbers. This means
that caution must be taken in bins with few entries.

27.3 Moments

For each specific $r_i$ we may obtain moments using the Binomial distribution with $q_i = 1-p_i$

$$E(r_i) = Np_i \quad \text{and} \quad V(r_i) = Np_i(1-p_i) = Np_iq_i$$

The covariance between two groups is given by

$$Cov(r_i,r_j) = -Np_ip_j \quad \text{for } i \ne j$$

27.4 Probability Generating Function

The probability generating function for the multinomial distribution is given by

$$G(z) = \left(\sum_{i=1}^kp_iz_i\right)^N$$

27.5 Random Number Generation 

The straightforward but time consuming way to generate random numbers from a multinomial
distribution is to follow the definition and generate $N$ uniform random numbers
which are assigned to specific bins according to the cumulative values of the $p$-vector.
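
The procedure may be sketched in Python as follows. The function name is hypothetical and
the code is only an illustration of the definition-based method, with a binary search used
to locate the bin of each uniform random number.

```python
import bisect
import itertools
import random


def multinomial_random(N, p, rng=random):
    """Distribute N entries over len(p) bins with probabilities p."""
    cumulative = list(itertools.accumulate(p))
    r = [0] * len(p)
    for _ in range(N):
        # Scaling by the last element guards against rounding in the sum of p.
        xi = rng.random() * cumulative[-1]
        i = min(bisect.bisect_right(cumulative, xi), len(p) - 1)
        r[i] += 1
    return r
```

The cost grows linearly with $N$, which is why the text calls the method time consuming.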

27.6 Significance Levels 

To determine a significance level for a certain outcome from a multinomial distribution one 
may add all outcomes which are as likely or less likely than the probability of the observed 
outcome. This may be a non-trivial calculation for large values of N since the number of 
possible outcomes grows very fast. An alternative, although quite clumsy, is to generate 
a number of multinomial random numbers and evaluate how often these outcomes are as 
likely or less likely than the observed one. 

If we as an example observe the outcome $r = (4,1,0,0,0,0)$ for a case with 5 observations
in 6 groups ($N = 5$ and $k = 6$) and the probability for all groups are the same
$p_i = 1/k = 1/6$ we obtain a probability of $p \approx 0.02$. This includes all orderings of the same
outcome since these are all equally probable but also all less likely outcomes of the type
$r = (5,0,0,0,0,0)$.

If a probability calculated in this manner is too small one may conclude that the null 
hypothesis that all probabilities are equal is wrong. Thus if our confidence level is preset 
to 95% this conclusion would be drawn in the above example. Of course, the conclusion 
would be wrong in 2% of all cases. 
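
For small $N$ and $k$ the sum over all outcomes that are as likely or less likely than the
observed one can be carried out by brute force. The Python sketch below is illustrative only:
the function names are hypothetical, and the small relative tolerance used when comparing
probabilities is an assumption made to treat outcomes of equal probability as equal despite
rounding.

```python
import math


def multinomial_probability(r, p):
    """Probability of the outcome vector r for group probabilities p."""
    coeff = math.factorial(sum(r))
    prob = 1.0
    for ri, pi in zip(r, p):
        coeff //= math.factorial(ri)     # stays an exact integer at every step
        prob *= pi ** ri
    return coeff * prob


def compositions(N, k):
    """Yield all vectors of k non-negative integers that sum to N."""
    if k == 1:
        yield (N,)
        return
    for first in range(N + 1):
        for rest in compositions(N - first, k - 1):
            yield (first,) + rest


def significance(observed, p, rel_tol=1e-9):
    """Sum the probabilities of all outcomes not more likely than the observed one."""
    p_obs = multinomial_probability(observed, p)
    total = 0.0
    for r in compositions(sum(observed), len(p)):
        pr = multinomial_probability(r, p)
        if pr <= p_obs * (1.0 + rel_tol):
            total += pr
    return total
```

For the example in the text, `significance((4, 1, 0, 0, 0, 0), [1/6] * 6)` sums 30 orderings
of the observed type and 6 outcomes with all observations in one group, i.e. 156 of the
7776 equally likely sequences.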

27.7 Equal Group Probabilities 

A common case or null hypothesis for a multinomial distribution is that the probability of
the $k$ groups is the same i.e. $p = 1/k$. In this case the multinomial distribution is simplified
and since ordering becomes insignificant much fewer unique outcomes are possible.

Take as an example a game where five dice are thrown. The probabilities for different
outcomes may quite readily be evaluated from basic probability theory properly accounting
for the $6^5 = 7776$ possible outcomes. But one may also use the multinomial distribution
with $k = 6$ and $N = 5$ to find probabilities for different outcomes. If we properly take
care of combinatorial coefficients for each outcome we obtain (with zeros for empty groups
suppressed)

name           outcome      # combinations   probability
one doublet    2,1,1,1      3600             0.46296
two doublets   2,2,1        1800             0.23148
triplets       3,1,1        1200             0.15432
nothing        1,1,1,1,1     720             0.09259
full house     3,2           300             0.03858
quadruplets    4,1           150             0.01929
quintuplets    5               6             0.00077
total                       7776             1.00000

The experienced dice player may note that the “nothing” group includes 240 combinations
giving straights (1 to 5 or 2 to 6). From this table we may verify the statement from
the previous subsection that the probability to get an outcome with quadruplets or less
likely outcomes is given by 0.02006.


Generally we have for $N \le k$ that the two extremes of either all observations in separate
groups $p_{sep}$ or all observations in one group $p_{all}$ are given by

$$p_{sep} = \frac{k!}{k^N(k-N)!} = \frac{k}{k}\cdot\frac{k-1}{k}\cdots\frac{k-N+1}{k}$$

$$p_{all} = \frac{1}{k^{N-1}}$$

which we could have concluded directly from a quite simple probability calculation.

The first case is the formula which shows the quite well known fact that if 23 people or
more are gathered the probability that at least two have the same birthday, i.e. $1-p_{sep}$, is
greater than 50% (using $N = 23$ and $k = 365$ and not bothering about leap-years or possible
deviations from the hypothesis of equal probabilities for each day). This somewhat non-intuitive
result becomes even more pronounced for higher values of $k$ and the level above
which $p_{sep} < 0.5$ is approximately given by

$$N \approx 1.2\sqrt{k}$$

For higher significance levels we may note that in the case with $k = 365$ the probability
$1-p_{sep}$ becomes greater than 90% at $N = 41$, greater than 99% at $N = 57$ and greater
than 99.9% at $N = 70$ i.e. already for $N \ll k$ a bet would be almost certain.
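
The quoted group sizes are easily checked numerically. The short Python sketch below, with
hypothetical function names, evaluates $p_{sep}$ as the product form of the formula above and
searches for the smallest $N$ for which $1-p_{sep}$ exceeds a given level.

```python
def p_sep(N, k):
    """Probability that N observations fall in N different groups out of k."""
    prob = 1.0
    for i in range(N):
        prob *= (k - i) / k     # factors k/k, (k-1)/k, ..., (k-N+1)/k
    return prob


def min_group_size(level, k):
    """Smallest N for which 1 - p_sep exceeds the given probability level."""
    N = 1
    while 1.0 - p_sep(N, k) <= level:
        N += 1
    return N
```

With $k = 365$ the calls `min_group_size(0.5, 365)` and `min_group_size(0.9, 365)` reproduce
the values 23 and 41 given in the text.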

In Fig. 20 we show, in linear scale to the left and logarithmic scale to the right, the
lower limit on $N$ for which the probability to have $1-p_{sep}$ above 50%, 90%, 99% and
99.9% for $k$-values ranging up to 1000. By use of the gamma function the problem has
been generalized to real numbers. Note that the curves start at certain values where $k = N$
since for $N > k$ it is impossible to have all events in separate groups 5.

5 This limit is at $N = k = 2$ for the 50%-curve, 3.92659 for 90%, 6.47061 for 99% and 8.93077 for 99.9%
28 Multinormal Distribution

28.1 Introduction

As a generalization of the normal or Gauss distribution to many dimensions we define the
multinormal distribution.

A multinormal distribution in $x = \{x_1, x_2, \ldots, x_n\}$ with parameters $\mu$ (mean vector)
and $V$ (variance matrix) is given by

$$f(x|\mu,V) = \frac{e^{-\frac{1}{2}(x-\mu)V^{-1}(x-\mu)^T}}{(2\pi)^{\frac{n}{2}}\sqrt{|V|}}$$

The variance matrix $V$ has to be a positive semi-definite matrix in order for $f$ to be a proper
probability density function (necessary in order that the normalization integral $\int f(x)dx$
should converge).

If $x$ is normal and $V$ non-singular then $(x-\mu)V^{-1}(x-\mu)^T$ is called the covariance form
of $x$ and has a $\chi^2$-distribution with $n$ degrees of freedom. Note that the distribution has
constant probability density for constant values of the covariance form.

The characteristic function is given by

$$\phi(t) = e^{it\mu-\frac{1}{2}t^TVt}$$

where $t$ is a vector of length $n$.

28.2 Conditional Probability Density 

The conditional density for a fixed value of any $x_i$ is given by a multinormal density with
$n-1$ dimensions where the new variance matrix is obtained by deleting the $i$:th row and
column of $V^{-1}$ and inverting the resulting matrix.

This may be compared to the case where we instead just want to neglect one of the
variables $x_i$. In this case the remaining variables have a multinormal distribution with $n-1$
dimensions with a variance matrix obtained by deleting the $i$:th row and column of $V$.

28.3 Probability Content 

As discussed in section 6.6 on the binormal distribution the joint probability content of a
multidimensional normal distribution is different, and smaller, than the corresponding well
known figures for the one-dimensional normal distribution. In the case of the binormal
distribution the ellipse (see figure 2 on page 20) corresponding to one standard deviation
has a joint probability content of 39.3%.

The same is even more true for the probability content within the hyperellipsoid in the
case of a multinormal distribution. In the table below we show, for different dimensions $n$,
the probability content for the one (denoted $z = 1$), two and three standard deviation contours.
We also give $z$-values $z_1$, $z_2$, and $z_3$ adjusted to give a probability content within the
hyperellipsoid corresponding to the one-dimensional one, two, and three standard deviation
contents (68.3%, 95.5%, and 99.7%). Finally $z$-values corresponding to joint probability
contents of 90%, 95% and 99% in $z_{90}$, $z_{95}$, and $z_{99}$, respectively, are given. Note that these
probability contents are independent of the variance matrix which only has the effect to
change the shape of the hyperellipsoid from a perfect hypersphere with radius $z$ when all
variables are uncorrelated to e.g. cigar shapes when correlations are large.

Note that this has implications on errors estimated from a chi-square or a maximum
likelihood fit. If a multiparameter confidence limit is requested and the chi-square minimum
is at $\chi^2_{min}$ or the logarithmic likelihood maximum at $\ln\mathcal{L}_{max}$, one should look for the error
contour at $\chi^2_{min}+z^2$ or $\ln\mathcal{L}_{max}-z^2/2$ using a $z$-value from the right-hand side of the
table below. The probability content for an $n$-dimensional multinormal distribution as given
below may be expressed in terms of the incomplete Gamma function by

$$p = P\left(\frac{n}{2},\frac{z^2}{2}\right)$$

as may be deduced by integrating a standard multinormal distribution out to a radius $z$.
Special formulae for the incomplete Gamma function $P(a,x)$ for integer and half-integer $a$
are given in section 42.5.3.



          Probability content in %              Adjusted z-values
  n     z = 1     z = 2     z = 3      z_1     z_2     z_3    z_90    z_95    z_99
  1     68.27     95.45     99.73    1.000   2.000   3.000   1.645   1.960   2.576
  2     39.35     86.47     98.89    1.515   2.486   3.439   2.146   2.448   3.035
  3     19.87     73.85     97.07    1.878   2.833   3.763   2.500   2.795   3.368
  4     9.020     59.40     93.89    2.172   3.117   4.031   2.789   3.080   3.644
  5     3.743     45.06     89.09    2.426   3.364   4.267   3.039   3.327   3.884
  6     1.439     32.33     82.64    2.653   3.585   4.479   3.263   3.548   4.100
  7     0.517     22.02     74.73    2.859   3.786   4.674   3.467   3.751   4.298
  8     0.175     14.29     65.77    3.050   3.974   4.855   3.655   3.938   4.482
  9     0.0562    8.859     56.27    3.229   4.149   5.026   3.832   4.113   4.655
 10     0.0172    5.265     46.79    3.396   4.314   5.187   3.998   4.279   4.818
 11     0.00504   3.008     37.81    3.556   4.471   5.340   4.156   4.436   4.972
 12     0.00142   1.656     29.71    3.707   4.620   5.486   4.307   4.585   5.120
 13     0.00038   0.881     22.71    3.853   4.764   5.626   4.451   4.729   5.262
 14     0.00010   0.453     16.89    3.992   4.902   5.762   4.590   4.867   5.398
 15     0.00003   0.226     12.25    4.126   5.034   5.892   4.723   5.000   5.530
 16     0.00001   0.1097    8.659    4.256   5.163   6.018   4.852   5.128   5.657
 17     ≈ 0       0.0517    5.974    4.382   5.287   6.140   4.977   5.252   5.780
 18     ≈ 0       0.0237    4.026    4.503   5.408   6.259   5.098   5.373   5.900
 19     ≈ 0       0.0106    2.652    4.622   5.525   6.374   5.216   5.490   6.016
 20     ≈ 0       0.00465   1.709    4.737   5.639   6.487   5.330   5.605   6.129
 25     ≈ 0       0.00005   0.1404   5.272   6.170   7.012   5.864   6.136   6.657
 30     ≈ 0       ≈ 0       0.0074   5.755   6.650   7.486   6.345   6.616   7.134

28.4 Random Number Generation 

In order to obtain random numbers from a multinormal distribution we proceed as follows:

• If $x = \{x_1, x_2, \ldots, x_n\}$ is distributed multinormally with mean 0 (zero vector) and
variance matrix $I$ (unity matrix) then each $x_i$ ($i = 1, 2, \ldots, n$) can be found independently
from a standard normal distribution.

• If $x$ is multinormally distributed with mean $\mu$ and variance matrix $V$ then any linear
combination $y = Sx$ is also multinormally distributed with mean $S\mu$ and variance
matrix $SVS^T$.

• If we want to generate vectors, $y$, from a multinormal distribution with mean $\mu$ and
variance matrix $V$ we may make a so called Cholesky decomposition of $V$, i.e. we find
a triangular matrix $S$ such that $V = SS^T$. We then calculate $y = Sx+\mu$ with the
components of $x$ generated independently from a standard normal distribution.

Thus we have found a quite nice way of generating multinormally distributed random
numbers which is important in many simulations where correlations between variables may
not be ignored. If many random numbers are to be generated for multinormal variables
from the same distribution it is beneficial to make the Cholesky decomposition once and
store the matrix $S$ for further usage.
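
The three points above can be sketched in plain Python as follows. This is an illustration
under stated assumptions: the function names are hypothetical, the variance matrix is passed
as a list of rows and is assumed to be symmetric and positive definite, and a lower triangular
$S$ is computed row by row.

```python
import math
import random


def cholesky(V):
    """Lower triangular S with V = S S^T for a symmetric positive definite V."""
    n = len(V)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(S[i][m] * S[j][m] for m in range(j))
            if i == j:
                S[i][i] = math.sqrt(V[i][i] - s)
            else:
                S[i][j] = (V[i][j] - s) / S[j][j]
    return S


def multinormal_random(mu, S, rng=random):
    """Return y = S x + mu with x a vector of independent standard normal numbers."""
    n = len(mu)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [mu[i] + sum(S[i][j] * x[j] for j in range(i + 1)) for i in range(n)]
```

As recommended in the text, `cholesky` is called once per variance matrix and the resulting
`S` is reused for every generated vector.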
29 Negative Binomial Distribution

29.1 Introduction

The Negative Binomial distribution is given by

$$p(r;k,p) = \binom{r-1}{k-1}p^k(1-p)^{r-k}$$

where the variable $r \ge k$ and the parameter $k > 0$ are integers and the parameter $p$
($0 \le p \le 1$) is a real number.

The distribution expresses the probability of having to wait exactly $r$ trials until $k$
successes have occurred if the probability of a success in a single trial is $p$ (probability of
failure $q = 1-p$).

The above form of the Negative Binomial distribution is often referred to as the Pascal
distribution after the French mathematician, physicist and philosopher Blaise Pascal
(1623-1662).

The distribution is sometimes expressed in terms of the number of failures occurring
while waiting for $k$ successes, $n = r-k$, in which case we write

$$p(n;k,p) = \binom{n+k-1}{n}p^k(1-p)^n$$

where the new variable $n \ge 0$.

Changing parameters, for this last form, to $\bar{n}$ and $k$ instead of $p$ and $k$, where $\bar{n}$ denotes
the mean of $n$, we sometimes use

$$p(n;\bar{n},k) = \binom{n+k-1}{n}\frac{\bar{n}^nk^k}{(\bar{n}+k)^{n+k}} = \binom{n+k-1}{n}\left(\frac{\bar{n}}{\bar{n}+k}\right)^n\left(\frac{k}{\bar{n}+k}\right)^k$$

The distribution may also be generalized to real values of $k$, although this may seem
obscure from the above probability view-point (“fractional success”), writing the binomial
coefficient as $(n+k-1)(n+k-2)\cdots(k+1)k/n!$.


29.2 Moments

In the first form given above the expectation value, variance, third and fourth central
moments of the distribution are

$$E(r) = \frac{k}{p}, \quad V(r) = \frac{kq}{p^2}, \quad \mu_3 = \frac{kq(2-p)}{p^3}, \quad \text{and} \quad \mu_4 = \frac{kq(p^2-6p+6+3kq)}{p^4}$$

The coefficients of skewness and kurtosis are

$$\gamma_1 = \frac{2-p}{\sqrt{kq}} \quad \text{and} \quad \gamma_2 = \frac{p^2-6p+6}{kq}$$

In the second formulation above, $p(n)$, the only difference is that the expectation value
becomes

$$E(n) = E(r)-k = \frac{k(1-p)}{p} = \frac{kq}{p}$$

while higher moments remain unchanged as they should since we have only shifted the scale
by a fixed amount.

In the last form given, using the parameters $\bar{n}$ and $k$, the expectation value and the
variance are

$$E(n) = \bar{n} \quad \text{and} \quad V(n) = \bar{n}+\frac{\bar{n}^2}{k}$$

29.3 Probability Generating Function

The probability generating function is given by

$$G(z) = \left(\frac{pz}{1-zq}\right)^k$$

in the first case ($p(r)$) and

$$G(z) = \left(\frac{p}{1-zq}\right)^k = \left(\frac{1}{1+(1-z)\frac{\bar{n}}{k}}\right)^k$$

in the second case ($p(n)$) for the two different parameterizations.


29.4 Relations to Other Distributions 

There are several interesting connections between the Negative Binomial distribution and 
other standard statistical distributions. In the following subsections we briefly address 
some of these connections. 


29.4.1 Poisson Distribution 

Regard the negative binomial distribution in the form

\[ p(n; \bar{n}, k) = \binom{n+k-1}{n} \left( \frac{1}{1+\bar{n}/k} \right)^k \left( \frac{\bar{n}/k}{1+\bar{n}/k} \right)^n \]

where $n \ge 0$, $k > 0$ and $\bar{n} > 0$.

As $k \to \infty$ the three terms become

\[ \binom{n+k-1}{n} = \frac{(n+k-1)(n+k-2)\cdots k}{n!} \to \frac{k^n}{n!}, \]

\[ \left( \frac{1}{1+\bar{n}/k} \right)^k = 1 - k\frac{\bar{n}}{k} + \frac{k(k+1)}{2}\left(\frac{\bar{n}}{k}\right)^2 - \frac{k(k+1)(k+2)}{6}\left(\frac{\bar{n}}{k}\right)^3 + \cdots \to e^{-\bar{n}} \]

and

\[ k^n \left( \frac{\bar{n}/k}{1+\bar{n}/k} \right)^n \to \bar{n}^n \]

where, for the last term, we have incorporated the factor $k^n$ from the first term.
Thus we have shown that

\[ \lim_{k \to \infty} p(n; \bar{n}, k) = \frac{\bar{n}^n e^{-\bar{n}}}{n!} \]

i.e. a Poisson distribution.

This “proof” could perhaps better be made using the probability generating function
of the negative binomial distribution

\[ G(z) = \left( \frac{p}{1 - zq} \right)^k = \left( \frac{1}{1 - (z-1)\bar{n}/k} \right)^k \]

Making a Taylor expansion of this for $(z-1)\bar{n}/k \ll 1$ we get

\[ G(z) = 1 + (z-1)\bar{n} + \frac{k+1}{k}\,\frac{(z-1)^2\bar{n}^2}{2} + \frac{(k+1)(k+2)}{k^2}\,\frac{(z-1)^3\bar{n}^3}{6} + \cdots \to e^{(z-1)\bar{n}} \]

as $k \to \infty$. This result we recognize as the probability generating function of the Poisson
distribution.


29.4.2 Gamma Distribution 

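Regard the negative binomial distribution in the form

\[ p(n; k, p) = \binom{n+k-1}{n} p^k q^n \]

where $n \ge 0$, $k > 0$ and $0 < p < 1$ and where we have introduced $q = 1 - p$. If we change
parameters from k and p to k and $\bar{n} = kq/p$ this may be written

\[ p(n; \bar{n}, k) = \binom{n+k-1}{n} \left( \frac{1}{1+\bar{n}/k} \right)^k \left( \frac{\bar{n}/k}{1+\bar{n}/k} \right)^n \]

Changing variable from n to $z = n/\bar{n}$ we get ($dn/dz = \bar{n}$)

\[
\begin{aligned}
p(z; \bar{n}, k) &= p(n; \bar{n}, k)\,\frac{dn}{dz} = \bar{n} \binom{z\bar{n}+k-1}{z\bar{n}} \left( \frac{1}{1+\bar{n}/k} \right)^k \left( \frac{\bar{n}/k}{1+\bar{n}/k} \right)^{z\bar{n}} \\
&= \bar{n}\,\frac{(z\bar{n}+k-1)(z\bar{n}+k-2)\cdots(z\bar{n}+1)}{\Gamma(k)}\,k^k \left( \frac{1}{k+\bar{n}} \right)^k \left( \frac{1}{k/\bar{n}+1} \right)^{z\bar{n}} \\
&\to \bar{n}\,\frac{(z\bar{n})^{k-1}}{\Gamma(k)}\,k^k \left( \frac{1}{k+\bar{n}} \right)^k \left( \frac{1}{k/\bar{n}+1} \right)^{z\bar{n}} \\
&= \frac{z^{k-1}k^k}{\Gamma(k)} \left( \frac{\bar{n}}{k+\bar{n}} \right)^k \left( \frac{1}{k/\bar{n}+1} \right)^{z\bar{n}} \to \frac{z^{k-1}k^k e^{-kz}}{\Gamma(k)}
\end{aligned}
\]

where we have used that for $k \ll \bar{n} \to \infty$

\[ \left( \frac{\bar{n}}{k+\bar{n}} \right)^k \to 1 \]

and

\[
\begin{aligned}
\left( \frac{1}{k/\bar{n}+1} \right)^{z\bar{n}} &= 1 - z\bar{n}\,\frac{k}{\bar{n}} + \frac{z\bar{n}(z\bar{n}+1)}{2}\left(\frac{k}{\bar{n}}\right)^2 - \frac{z\bar{n}(z\bar{n}+1)(z\bar{n}+2)}{6}\left(\frac{k}{\bar{n}}\right)^3 + \cdots \\
&\to 1 - zk + \frac{z^2k^2}{2} - \frac{z^3k^3}{6} + \cdots = e^{-kz}
\end{aligned}
\]

as $\bar{n} \to \infty$.

Thus we have “shown” that as $\bar{n} \to \infty$ and $\bar{n} \gg k$ we obtain a gamma distribution in
the variable $z = n/\bar{n}$.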

29.4.3 Logarithmic Distribution

Regard the negative binomial distribution in the form

\[ p(n; k, p) = \binom{n+k-1}{n} p^k q^n \]

where $n \ge 0$, $k > 0$ and $0 < p < 1$ and where we have introduced $q = 1 - p$.
The probabilities for n = 0, 1, 2, 3, ... are given by

\[ \{p(0), p(1), p(2), p(3), \ldots\} = p^k \left\{ 1,\; kq,\; \frac{k(k+1)}{2!}q^2,\; \frac{k(k+1)(k+2)}{3!}q^3,\; \ldots \right\} \]

If we omit the zero class (n = 0) and renormalize we get

\[ \frac{kp^k}{1-p^k} \left\{ 0,\; q,\; \frac{k+1}{2!}q^2,\; \frac{(k+1)(k+2)}{3!}q^3,\; \ldots \right\} \]

and if we let $k \to 0$ we finally obtain

\[ -\frac{1}{\ln p} \left\{ 0,\; q,\; \frac{q^2}{2},\; \frac{q^3}{3},\; \ldots \right\} \]

where we have used that

\[ \lim_{k \to 0} \frac{k}{p^{-k} - 1} = -\frac{1}{\ln p} \]

which is easily realized by expanding $p^{-k} = e^{-k \ln p}$ into a power series.
This we recognize as the logarithmic distribution

\[ p(n; p) = -\frac{1}{\ln p}\,\frac{(1-p)^n}{n} \]

Thus we have shown that omitting the zero class and letting $k \to 0$ the negative binomial
distribution becomes the logarithmic distribution.


29.4.4 Branching Process 

In a process where a branching occurs from a Poisson to a logarithmic distribution the most
elegant way to determine the resulting distribution is by use of the probability generating
function. The probability generating functions for a Poisson distribution with parameter
(mean) $\mu$ and for a logarithmic distribution with parameter p ($q = 1 - p$) are given by

\[ G_P(z) = e^{\mu(z-1)} \quad \text{and} \quad G_L(z) = \ln(1 - zq)/\ln(1 - q) = \alpha \ln(1 - zq) \]

where $\mu > 0$, $0 \le q \le 1$ and $\alpha = 1/\ln p$.

For a branching process in n steps

\[ G(z) = G_1(G_2(\ldots G_{n-1}(G_n(z)) \ldots)) \]

where $G_k(z)$ is the probability generating function in the k:th step. In the above case this
gives

\[
\begin{aligned}
G(z) &= G_P(G_L(z)) = \exp\{\mu(\alpha \ln(1 - zq) - 1)\} \\
&= \exp\{\alpha\mu \ln(1 - zq) - \mu\} = (1 - zq)^{\alpha\mu} e^{-\mu} \\
&= (1 - zq)^{-k}(1 - q)^k = p^k/(1 - zq)^k
\end{aligned}
\]

where we have put $k = -\alpha\mu$. This we recognize as the probability generating function of
a negative binomial distribution with parameters k and p.

We have thus shown that a Poisson distribution with mean $\mu$ branching into a logarithmic
distribution with parameter p gives rise to a negative binomial distribution with
parameters $k = -\alpha\mu = -\mu/\ln p$ and p (or $\bar{n} = kq/p$).

Conversely a negative binomial distribution with parameters k and p or $\bar{n}$ could arise
from the combination of a Poisson distribution with parameter $\mu = -k \ln p = k \ln(1 + \bar{n}/k)$
and a logarithmic distribution with parameter p and mean $\bar{n}/\mu$.

A particle physics example would be a charged multiplicity distribution arising from the
production of independent clusters subsequently decaying into charged particles according
to a logarithmic distribution. The UA5 experiment [36] found at the Sp̄pS collider at
CERN that at a centre of mass energy of 540 GeV a negative binomial distribution with
$\bar{n} = 28.3$ and k = 3.69 fitted the data well. With the above scenario this would correspond
to about 8 clusters being independently produced (Poisson distribution with $\mu = 7.97$) each one
decaying, according to a logarithmic distribution, into 3.55 charged particles on average.
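
The numbers quoted for this example can be checked directly from the relations $p = 1/(1 + \bar{n}/k)$, $\mu = -k \ln p$ and mean cluster size $\bar{n}/\mu$. The short Python sketch below does just that; the variable names are our own choice for illustration.

```python
import math

# UA5 fit values quoted in the text
n_bar, k = 28.3, 3.69

# Parameter p of the negative binomial, from n_bar = k q / p with q = 1 - p
p = 1.0 / (1.0 + n_bar / k)

# Poisson mean (average number of clusters) and mean of the logarithmic
# distribution (average number of charged particles per cluster)
mu = -k * math.log(p)
mean_per_cluster = n_bar / mu

print(round(mu, 2), round(mean_per_cluster, 2))
```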


29.4.5 Poisson and Gamma Distributions 

If a Poisson distribution with mean $\mu > 0$

\[ p(n; \mu) = \frac{e^{-\mu}\mu^n}{n!} \quad \text{for } n \ge 0 \]

is weighted by a gamma distribution with parameters a > 0 and b > 0

\[ f(x; a, b) = \frac{a(ax)^{b-1}e^{-ax}}{\Gamma(b)} \quad \text{for } x > 0 \]

we obtain

\[
\begin{aligned}
P(n) &= \int_0^\infty p(n; \mu) f(\mu; a, b)\,d\mu = \int_0^\infty \frac{e^{-\mu}\mu^n}{n!}\,\frac{a(a\mu)^{b-1}e^{-a\mu}}{\Gamma(b)}\,d\mu \\
&= \frac{a^b}{n!\,\Gamma(b)} \int_0^\infty \mu^{n+b-1}e^{-\mu(a+1)}\,d\mu = \frac{a^b}{n!\,\Gamma(b)}\,(n+b-1)!\,(a+1)^{-(n+b)} \\
&= \binom{n+b-1}{n} \left( \frac{a}{a+1} \right)^b \left( \frac{1}{a+1} \right)^n
\end{aligned}
\]

which is a negative binomial distribution with parameters $p = \frac{a}{a+1}$, i.e. $q = 1 - p = \frac{1}{a+1}$,
and k = b. If we aim at a negative binomial distribution with parameters $\bar{n}$ and k we should
thus weight a Poisson distribution with a gamma distribution with parameters $a = k/\bar{n}$
and b = k. This is the same as superimposing Poisson distributions with means coming
from a gamma distribution with mean $\bar{n}$.

In the calculation above we have made use of integral tables for the integral

\[ \int_0^\infty x^n e^{-ax}\,dx = n!\,a^{-(n+1)} \]


29.5 Random Number Generation 

In order to obtain random numbers from a Negative Binomial distribution we may use the
recursive formula

\[ p(r+1) = p(r)\,\frac{qr}{r+1-k} \quad \text{or} \quad p(n+1) = p(n)\,\frac{q(k+n)}{n+1} \]

for r = k, k+1, ... and n = 0, 1, ... in the two cases, starting with the first term (p(k) or
p(0)) being equal to $p^k$. This technique may be speeded up considerably, if p and k are
constants, by preparing a cumulative vector once and for all.
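
As an illustration, the cumulative technique for the second form, p(n), might be sketched in Python as follows. The function names, the truncation tolerance `tol` and the use of a binary search are our own assumptions and not part of the original text.

```python
import bisect
import random

def negbin_cumulative(k, p, tol=1e-12):
    """Cumulative vector for p(n) = C(n+k-1, n) p^k q^n, n = 0, 1, ...

    The vector is truncated once the remaining tail is below tol.
    """
    q = 1.0 - p
    term = p ** k                      # first term p(0) = p^k
    total = term
    cdf = [total]
    n = 0
    while 1.0 - total > tol and term > 0.0:
        term *= q * (k + n) / (n + 1)  # p(n+1) = p(n) q (k+n)/(n+1)
        n += 1
        total += term
        cdf.append(total)
    return cdf

def negbin_sample(cdf, rng=random):
    """Draw n by locating a uniform random number in the cumulative vector."""
    xi = rng.random()
    # first index with cdf[n] >= xi; clamp if xi falls in the truncated tail
    return min(bisect.bisect_left(cdf, xi), len(cdf) - 1)
```

The vector is prepared once for fixed k and p, after which each random number costs one uniform deviate and a binary search.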

One may also use some of the relations described above such as the branching of a
Poisson to a Logarithmic distribution (see footnote 6) or a Poisson distribution weighted by a
Gamma distribution (see footnote 7). This, however, will always be less efficient than the
straightforward cumulative technique.


6 Generating random numbers from a Poisson distribution with mean $\mu = -k \ln p$ branching to a
Logarithmic distribution with parameter p will give a Negative Binomial distribution with parameters k and p.

7 Taking a Poisson distribution with a mean distributed according to a Gamma distribution with
parameters $a = k/\bar{n}$ and b = k.
30 Non-central Beta-distribution


30.1 Introduction


The non-central Beta-distribution is given by

\[ f(x; p, q, \lambda) = \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,\frac{x^{p+r-1}(1-x)^{q-1}}{B(p+r, q)} \]

where p and q are positive real quantities and the non-centrality parameter $\lambda \ge 0$.

In figure 21 we show examples of a non-central Beta distribution with p = | and q = 3 
varying the non-central parameter $\lambda$ from zero (an ordinary Beta distribution) to ten in
steps of two. 



Figure 21: Graph of non-central Beta-distribution for p — |, q — 3 and some values of A 


30.2 Derivation of distribution 


If $y_m$ and $y_n$ are two independent variables distributed according to the chi-squared
distribution with m and n degrees of freedom, respectively, then the ratio $y_m/(y_m + y_n)$ follows
a Beta distribution with parameters $p = m/2$ and $q = n/2$. If instead $y_m$ follows a non-central
chi-square distribution we may proceed in a similar way as was done for the derivation of
the Beta-distribution (see section 4.2).

We make a change of variables to $x = y_m/(y_m + y_n)$ and $y = y_m + y_n$ which implies that
$y_m = xy$ and $y_n = y(1-x)$ obtaining

\[
\begin{aligned}
f(x, y) &= \left| \begin{matrix} \partial y_m/\partial x & \partial y_m/\partial y \\ \partial y_n/\partial x & \partial y_n/\partial y \end{matrix} \right| f(y_m, y_n) = \left| \begin{matrix} y & x \\ -y & 1-x \end{matrix} \right| f(y_m, y_n) \\
&= y \left\{ \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,\frac{(y_m/2)^{m/2+r-1}e^{-y_m/2}}{2\Gamma(m/2+r)} \right\} \left\{ \frac{(y_n/2)^{n/2-1}e^{-y_n/2}}{2\Gamma(n/2)} \right\} \\
&= \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,\frac{x^{m/2+r-1}(1-x)^{n/2-1}}{B(m/2+r, n/2)} \left\{ \frac{(y/2)^{(m+n)/2+r-1}e^{-y/2}}{2\Gamma((m+n)/2+r)} \right\}
\end{aligned}
\]

In the last braces we see a chi-square distribution in y with m + n + 2r degrees of
freedom and integrating f(x, y) over y in order to get the marginal distribution in x gives
us the non-central Beta-distribution as given above with $p = m/2$ and $q = n/2$.

If instead $y_n$ were distributed as a non-central chi-square distribution we would get
a very similar expression (not surprising since $y_m/(y_m + y_n) = 1 - y_n/(y_m + y_n)$) but it is
the form obtained when $y_m$ is non-central that is normally referred to as the non-central
Beta-distribution.


30.3 Moments 


Algebraic moments of the non-central Beta-distribution are given in terms of the
hypergeometric function ${}_2F_2$ as

\[
\begin{aligned}
E(x^k) &= \int_0^1 x^k f(x; p, q, \lambda)\,dx = \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!} \int_0^1 \frac{x^{p+r+k-1}(1-x)^{q-1}}{B(p+r, q)}\,dx \\
&= \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,\frac{B(p+r+k, q)}{B(p+r, q)} = \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,\frac{\Gamma(p+r+k)\,\Gamma(p+r+q)}{\Gamma(p+r)\,\Gamma(p+r+q+k)} \\
&= \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,\frac{(p+r+k-1)\cdots(p+r+1)(p+r)}{(p+q+r+k-1)\cdots(p+q+r+1)(p+q+r)} \\
&= e^{-\lambda/2}\,\frac{\Gamma(p+k)\,\Gamma(p+q)}{\Gamma(p)\,\Gamma(p+q+k)} \cdot {}_2F_2\!\left(p+q, p+k;\, p, p+q+k;\, \tfrac{\lambda}{2}\right)
\end{aligned}
\]

However, evaluating the hypergeometric function involves a summation so it is more
efficient to directly use the penultimate expression above.


30.4 Cumulative distribution 

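The cumulative distribution is found by straightforward integration

\[ F(x) = \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,\frac{B_x(p+r, q)}{B(p+r, q)} = \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,I_x(p+r, q) \]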


30.5 Random Number Generation 

Random numbers from a non-central Beta-distribution with integer or half-integer p- and
q-values are easily obtained using the definition above i.e. by using a random number from
a non-central chi-square distribution and another from a (central) chi-square distribution.
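
A minimal Python sketch of this recipe is given below. It assumes that 2p and 2q are positive integers (m = 2p and n = 2q degrees of freedom) and builds both chi-square variates as sums of squared normal random numbers; the function name and this construction are our own illustrative choices.

```python
import math
import random

def noncentral_beta(p, q, lam, rng=random):
    """Ratio y_m / (y_m + y_n) with y_m non-central and y_n central chi-square.

    Assumes 2p and 2q are positive integers (integer or half-integer p, q).
    """
    m, n = round(2 * p), round(2 * q)
    mu = math.sqrt(lam / m)            # equal means with sum of squares = lam
    y_m = sum(rng.gauss(mu, 1.0) ** 2 for _ in range(m))
    y_n = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return y_m / (y_m + y_n)
```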
31 Non-central Chi-square Distribution

31.1 Introduction


If we instead of adding squares of n independent standard normal, N(0,1), variables,
giving rise to the chi-square distribution with n degrees of freedom, add squares of $N(\mu_i, 1)$
variables we obtain the non-central chi-square distribution

\[ f(x; n, \lambda) = \sum_{r=0}^{\infty} e^{-\lambda/2}\,\frac{(\lambda/2)^r}{r!}\,f(x; n+2r) = \frac{1}{2^{n/2}\,\Gamma(\frac{1}{2})}\,x^{\frac{n}{2}-1}e^{-\frac{1}{2}(x+\lambda)} \sum_{r=0}^{\infty} \frac{(\lambda x)^r}{(2r)!}\,\frac{\Gamma(\frac{1}{2}+r)}{\Gamma(\frac{n}{2}+r)} \]

where $\lambda = \sum \mu_i^2$ is the non-central parameter and f(x; n) the ordinary chi-square
distribution. As for the latter the variable $x \ge 0$ and the parameter n is a positive integer. The
additional parameter $\lambda \ge 0$ and in the limit $\lambda = 0$ we recover the ordinary chi-square
distribution. According to [2] pp 227-229 the non-central chi-square distribution was first
introduced by R. A. Fisher in 1928. In figure 22 we show the distribution for n = 5 and
non-central parameter $\lambda$ = 0, 1, 2, 3, 4, 5 (zero corresponding to the ordinary chi-squared
distribution).



Figure 22: Graph of non-central chi-square distribution for n = 5 and some values of $\lambda$


31.2 Characteristic Function 


The characteristic function for the non-central chi-square distribution is given by

\[ \phi(t) = \frac{\exp\left\{ \frac{i\lambda t}{1-2it} \right\}}{(1-2it)^{n/2}} \]

but even more useful in determining moments is

\[ \ln \phi(t) = \frac{i\lambda t}{1-2it} - \frac{n}{2}\ln(1-2it) \]

from which cumulants may be determined in a similar manner as we normally obtain
algebraic moments from $\phi(t)$ (see below).

By looking at the characteristic function one sees that the sum of two non-central
chi-square variates has the same distribution with degrees of freedom as well as non-central
parameters being the sum of the corresponding parameters for the individual distributions.


31.3 Moments 

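To use the characteristic function to obtain algebraic moments is not trivial but the
cumulants (see section 2.5) are easily found to be given by the formula

\[ \kappa_r = 2^{r-1}(r-1)!\,(n + r\lambda) \quad \text{for } r \ge 1 \]

from which we may find the lower order algebraic and central moments (with $a = n + \lambda$
and $b = \lambda/a$) as

\[
\begin{aligned}
\mu'_1 &= \kappa_1 = a = n + \lambda \\
\mu_2 &= \kappa_2 = 2a(1+b) = 2(n + 2\lambda) \\
\mu_3 &= \kappa_3 = 8a(1+2b) = 8(n + 3\lambda) \\
\mu_4 &= \kappa_4 + 3\kappa_2^2 = 48(n + 4\lambda) + 12(n + 2\lambda)^2 \\
\mu_5 &= \kappa_5 + 10\kappa_3\kappa_2 = 384(n + 5\lambda) + 160(n + 2\lambda)(n + 3\lambda) \\
\mu_6 &= \kappa_6 + 15\kappa_4\kappa_2 + 10\kappa_3^2 + 15\kappa_2^3 \\
&= 3840(n + 6\lambda) + 1440(n + 2\lambda)(n + 4\lambda) + 640(n + 3\lambda)^2 + 120(n + 2\lambda)^3 \\
\gamma_1 &= \left( \frac{2}{1+b} \right)^{3/2} \frac{1+2b}{\sqrt{a}} = \frac{8(n + 3\lambda)}{[2(n + 2\lambda)]^{3/2}} \\
\gamma_2 &= \frac{12}{a}\,\frac{1+3b}{(1+b)^2} = \frac{12(n + 4\lambda)}{(n + 2\lambda)^2}
\end{aligned}
\]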


31.4 Cumulative Distribution 

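The cumulative, or distribution, function may be found by

\[
\begin{aligned}
F(x) &= \frac{1}{2^{n/2}\,\Gamma(\frac{1}{2})}\,e^{-\lambda/2} \sum_{r=0}^{\infty} \frac{\lambda^r}{(2r)!}\,\frac{\Gamma(\frac{1}{2}+r)}{\Gamma(\frac{n}{2}+r)} \int_0^x u^{\frac{n}{2}+r-1}e^{-u/2}\,du \\
&= \frac{1}{2^{n/2}\,\Gamma(\frac{1}{2})}\,e^{-\lambda/2} \sum_{r=0}^{\infty} \frac{\lambda^r}{(2r)!}\,\frac{\Gamma(\frac{1}{2}+r)}{\Gamma(\frac{n}{2}+r)}\,2^{\frac{n}{2}+r}\,\gamma\!\left(\tfrac{n}{2}+r, \tfrac{x}{2}\right) \\
&= e^{-\lambda/2} \sum_{r=0}^{\infty} \frac{(\lambda/2)^r}{r!}\,P\!\left(\tfrac{n}{2}+r, \tfrac{x}{2}\right)
\end{aligned}
\]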


31.5 Approximations 

An approximation by a (central) chi-square distribution is found by equating the first two
cumulants of a non-central chi-square distribution with those of $\rho$ times a chi-square
distribution. Here $\rho$ is a constant to be determined. The result is that with

\[ \rho = \frac{n + 2\lambda}{n + \lambda} = 1 + \frac{\lambda}{n + \lambda} \quad \text{and} \quad n^* = \frac{(n + \lambda)^2}{n + 2\lambda} = n + \frac{\lambda^2}{n + 2\lambda} \]

we may approximate a non-central chi-square distribution $f(x; n, \lambda)$ with a (central)
chi-square distribution in $x/\rho$ with $n^*$ degrees of freedom ($n^*$ in general being fractional).
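Approximations to the standard normal distribution are given using (with $a = n + \lambda$ and
$b = \lambda/a$ as in section 31.3)

\[ z = \sqrt{\frac{2x}{1+b}} - \sqrt{\frac{2a}{1+b} - 1} \quad \text{or} \quad z = \frac{\left( \frac{x}{a} \right)^{1/3} - \left[ 1 - \frac{2}{9}\,\frac{1+b}{a} \right]}{\sqrt{\frac{2}{9}\,\frac{1+b}{a}}} \]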


31.6 Random Number Generation 


Random numbers from a non-central chi-square distribution are easily obtained using the
definition above by e.g.

• Put $\mu = \sqrt{\lambda/n}$

• Sum the squares of n random numbers from a normal distribution with mean $\mu$ and variance
unity. Note that this is not a unique choice. The only requirement is that $\lambda = \sum \mu_i^2$.

• Return the sum as a random number from a non-central chi-square distribution with
n degrees of freedom and non-central parameter $\lambda$.
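
The steps above can be sketched in Python as follows. The function name is our own, and the sketch assumes the equal-means choice $\mu = \sqrt{\lambda/n}$ and that the squares of the normal variates are summed, in line with the definition in section 31.1.

```python
import math
import random

def noncentral_chi2(n, lam, rng=random):
    """Sum of squares of n normal variates N(mu, 1) with n * mu^2 = lam."""
    mu = math.sqrt(lam / n)
    return sum(rng.gauss(mu, 1.0) ** 2 for _ in range(n))
```

A quick check is to compare the sample mean and variance with $n + \lambda$ and $2(n + 2\lambda)$ from section 31.3.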


This ought to be sufficient for most applications but if needed more efficient techniques 
may easily be developed e.g. using more general techniques. 
32 Non-central F-Distribution


32.1 Introduction


If $x_1$ is distributed according to a non-central chi-square distribution with m degrees of
freedom and non-central parameter $\lambda$ and $x_2$ according to a (central) chi-square distribution
with n degrees of freedom then, provided $x_1$ and $x_2$ are independent, the variable

\[ F' = \frac{x_1/m}{x_2/n} \]

is said to have a non-central F-distribution with m, n degrees of freedom (positive integers)
and non-central parameter $\lambda \ge 0$. As the non-central chi-square distribution it was first
discussed by R. A. Fisher in 1928.

This distribution in F' may be written

\[ f(F'; m, n, \lambda) = e^{-\lambda/2} \sum_{r=0}^{\infty} \frac{1}{r!} \left( \frac{\lambda}{2} \right)^r \frac{\Gamma\!\left(\frac{m+n}{2}+r\right)}{\Gamma\!\left(\frac{m}{2}+r\right)\Gamma\!\left(\frac{n}{2}\right)} \left( \frac{m}{n} \right)^{\frac{m}{2}+r} \frac{(F')^{\frac{m}{2}-1+r}}{\left( 1 + \frac{mF'}{n} \right)^{\frac{1}{2}(m+n)+r}} \]

In figure 23 we show the non-central F-distribution for the case with m = 10 and n = 5
varying $\lambda$ from zero (an ordinary, central, F-distribution) to five.


Figure 23: Graph of non-central F-distribution for m = 10, n = 5 and some values of $\lambda$

When m = 1 the non-central F-distribution reduces to a non-central $t^2$-distribution
with $\delta^2 = \lambda$. As $n \to \infty$ then mF' approaches a non-central chi-square distribution with m
degrees of freedom and non-central parameter $\lambda$.
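
32.2 Moments

Algebraic moments of the non-central F-distribution may be obtained by straightforward,
but somewhat tedious, algebra as

\[ E(F'^k) = \int_0^\infty x^k f(x; m, n, \lambda)\,dx = e^{-\lambda/2} \left( \frac{n}{m} \right)^k \frac{\Gamma\!\left(\frac{n}{2}-k\right)}{\Gamma\!\left(\frac{n}{2}\right)} \sum_{r=0}^{\infty} \frac{1}{r!} \left( \frac{\lambda}{2} \right)^r \frac{\Gamma\!\left(\frac{m}{2}+r+k\right)}{\Gamma\!\left(\frac{m}{2}+r\right)} \]

an expression which may be used to find lower order moments (defined for n > 2k)

\[
\begin{aligned}
E(F') &= \frac{n}{m} \cdot \frac{m + \lambda}{n - 2} \\
E(F'^2) &= \left( \frac{n}{m} \right)^2 \frac{1}{(n-2)(n-4)} \left\{ \lambda^2 + (2\lambda + m)(m + 2) \right\} \\
E(F'^3) &= \left( \frac{n}{m} \right)^3 \frac{1}{(n-2)(n-4)(n-6)} \left\{ \lambda^3 + 3(m+4)\lambda^2 + (3\lambda + m)(m+4)(m+2) \right\} \\
E(F'^4) &= \left( \frac{n}{m} \right)^4 \frac{1}{(n-2)(n-4)(n-6)(n-8)} \left\{ \lambda^4 + 4(m+6)\lambda^3 + 6(m+6)(m+4)\lambda^2 + (4\lambda + m)(m+6)(m+4)(m+2) \right\} \\
V(F') &= \left( \frac{n}{m} \right)^2 \frac{2}{(n-2)(n-4)} \left\{ \frac{(\lambda + m)^2}{n - 2} + 2\lambda + m \right\}
\end{aligned}
\]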


32.3 Cumulative Distribution 

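The cumulative, or distribution, function may be found by

\[
\begin{aligned}
F(x) &= \int_0^x f(u; m, n, \lambda)\,du \\
&= e^{-\lambda/2} \sum_{r=0}^{\infty} \frac{1}{r!} \left( \frac{\lambda}{2} \right)^r \frac{\Gamma\!\left(\frac{m+n}{2}+r\right)}{\Gamma\!\left(\frac{m}{2}+r\right)\Gamma\!\left(\frac{n}{2}\right)} \left( \frac{m}{n} \right)^{\frac{m}{2}+r} \int_0^x \frac{u^{\frac{m}{2}-1+r}}{\left( 1 + \frac{mu}{n} \right)^{\frac{m+n}{2}+r}}\,du \\
&= e^{-\lambda/2} \sum_{r=0}^{\infty} \frac{1}{r!} \left( \frac{\lambda}{2} \right)^r \frac{B_q\!\left(\frac{m}{2}+r, \frac{n}{2}\right)}{B\!\left(\frac{m}{2}+r, \frac{n}{2}\right)} = e^{-\lambda/2} \sum_{r=0}^{\infty} \frac{1}{r!} \left( \frac{\lambda}{2} \right)^r I_q\!\left(\tfrac{m}{2}+r, \tfrac{n}{2}\right)
\end{aligned}
\]

with

\[ q = \frac{mx/n}{1 + mx/n} \]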


32.4 Approximations 

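Using the approximation of a non-central chi-square distribution by a (central) chi-square
distribution given in the previous section we see that

\[ \frac{m}{m + \lambda}\,F' \]

is approximately distributed according to a (central) F-distribution with $m^* = m + \frac{\lambda^2}{m + 2\lambda}$
and n degrees of freedom.

Approximations to the standard normal distribution are achieved with

\[ z_1 = \frac{F' - E(F')}{\sqrt{V(F')}} = \frac{F' - \frac{n(m+\lambda)}{m(n-2)}}{\frac{n}{m} \left[ \frac{2}{(n-2)(n-4)} \left\{ \frac{(m+\lambda)^2}{n-2} + m + 2\lambda \right\} \right]^{1/2}} \]

or

\[ z_2 = \frac{\left( \frac{mF'}{m+\lambda} \right)^{1/3} \left( 1 - \frac{2}{9n} \right) - \left( 1 - \frac{2}{9} \cdot \frac{m+2\lambda}{(m+\lambda)^2} \right)}{\left[ \frac{2}{9} \cdot \frac{m+2\lambda}{(m+\lambda)^2} + \frac{2}{9n} \cdot \left( \frac{mF'}{m+\lambda} \right)^{2/3} \right]^{1/2}} \]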

32.5 Random Number Generation 

Random numbers from a non-central F-distribution are easily obtained using the
definition above i.e. by using a random number from a non-central chi-square distribution
and another from a (central) chi-square distribution.
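
A minimal Python sketch of this definition is shown below. The function name is an assumption of ours, and both chi-square variates are built as sums of squared normal random numbers, which is only one of several possible ways to generate them.

```python
import math
import random

def noncentral_f(m, n, lam, rng=random):
    """F' = (x1/m) / (x2/n) with x1 non-central and x2 central chi-square."""
    mu = math.sqrt(lam / m)            # equal means with sum of squares = lam
    x1 = sum(rng.gauss(mu, 1.0) ** 2 for _ in range(m))
    x2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return (x1 / m) / (x2 / n)
```

For n > 2 the sample mean may be compared with $E(F') = n(m+\lambda)/(m(n-2))$ from section 32.2.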
33 Non-central t-Distribution


33.1 Introduction


If x is distributed according to a normal distribution with mean $\delta$ and variance 1 and y
according to a chi-square distribution with n degrees of freedom (independent of x) then

\[ t' = \frac{x}{\sqrt{y/n}} \]

has a non-central t-distribution with n degrees of freedom (positive integer) and non-central
parameter $\delta$ (real).

We may also write

\[ t' = \frac{z + \delta}{\sqrt{w/n}} \]

where z is a standard normal variate and w is distributed as a chi-square variable with n
degrees of freedom.

The distribution is given by (see comments on derivation in section below)

\[ f(t'; n, \delta) = \frac{e^{-\delta^2/2}}{\sqrt{n\pi}\,\Gamma\!\left(\frac{n}{2}\right)} \sum_{r=0}^{\infty} \frac{(t'\delta)^r}{r!\,n^{r/2}} \left( 1 + \frac{t'^2}{n} \right)^{-\frac{n+r+1}{2}} 2^{r/2}\,\Gamma\!\left( \frac{n+r+1}{2} \right) \]

In figure 24 we show the non-central t-distribution for the case with n = 10 varying $\delta$
from zero (an ordinary t-distribution) to five.


Figure 24: Graph of non-central t-distribution for n = 10 and some values of $\delta$


This distribution is of importance in hypothesis testing if we are interested in the
probability of committing a Type II error, implying that we would accept a hypothesis
although it was wrong, see discussion in section 38.11 on page 146.


33.2 Derivation of distribution 

Not many text-books include a formula for the non-central t-distribution and some turn
out to give erroneous expressions. A non-central F-distribution with m = 1 becomes a
non-central $t^2$-distribution which then may be transformed to a non-central t-distribution.
However, with this approach one easily gets into trouble for t' < 0. Instead we adopt
a technique very similar to what is used in section 38.6 to obtain the normal (central)
t-distribution from a t-ratio.

The difference in the non-central case is the presence of the $\delta$-parameter which
introduces two new exponential terms in the equations to be solved. One is simply $\exp(-\delta^2/2)$
but another factor we treat by a serial expansion leading to the p.d.f. above. This may not
be the ‘best’ possible expression but empirically it works quite well.


33.3 Moments 


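With some effort the p.d.f. above may be used to calculate algebraic moments of the
distribution yielding

\[ E(t'^k) = \frac{e^{-\delta^2/2}}{\sqrt{\pi}\,\Gamma\!\left(\frac{n}{2}\right)}\,n^{k/2}\,\Gamma\!\left( \frac{n-k}{2} \right) \sum_{r=0}^{\infty} \frac{\delta^r 2^{r/2}}{r!}\,\Gamma\!\left( \frac{r+k+1}{2} \right) \]

where the sum should be made for odd (even) values of r if k is odd (even). This gives for
low orders

\[
\begin{aligned}
\mu'_1 &= \sqrt{\frac{n}{2}}\,\frac{\Gamma\!\left(\frac{n-1}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)}\,\delta \\
\mu'_2 &= \frac{n\,\Gamma\!\left(\frac{n-2}{2}\right)}{2\,\Gamma\!\left(\frac{n}{2}\right)} \left( 1 + \delta^2 \right) = \frac{n}{n-2} \left( 1 + \delta^2 \right) \\
\mu'_3 &= \frac{n^{3/2}\sqrt{2}\,\Gamma\!\left(\frac{n-3}{2}\right)}{4\,\Gamma\!\left(\frac{n}{2}\right)}\,\delta \left( 3 + \delta^2 \right) \\
\mu'_4 &= \frac{n^2\,\Gamma\!\left(\frac{n-4}{2}\right)}{4\,\Gamma\!\left(\frac{n}{2}\right)} \left( \delta^4 + 6\delta^2 + 3 \right) = \frac{n^2}{(n-2)(n-4)} \left( \delta^4 + 6\delta^2 + 3 \right)
\end{aligned}
\]

from which expressions for central moments may be found e.g. the variance

\[ \mu_2 = V(t') = \frac{n}{2} \left\{ \left( 1 + \delta^2 \right) \frac{\Gamma\!\left(\frac{n-2}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)} - \delta^2 \left( \frac{\Gamma\!\left(\frac{n-1}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)} \right)^2 \right\} \]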

33.4 Cumulative Distribution 

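The cumulative, or distribution, function may be found by

\[ F(t) = \frac{e^{-\delta^2/2}}{\sqrt{\pi}} \sum_{r=0}^{\infty} \frac{\delta^r}{r!}\,2^{\frac{r}{2}-1}\,\Gamma\!\left( \frac{r+1}{2} \right) \left\{ s_1 + s_2\,I_q\!\left( \frac{r+1}{2}, \frac{n}{2} \right) \right\} \quad \text{with} \quad q = \frac{t^2/n}{1 + t^2/n} \]

where $s_1$ and $s_2$ are signs differing between cases with positive or negative t as well as
odd or even r in the summation. The sign $s_1$ is -1 if r is odd and +1 if it is even while $s_2$
is +1 unless t < 0 and r is even in which case it is -1.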


33.5 Approximation 

An approximation is given by

\[ z = \frac{t' \left( 1 - \frac{1}{4n} \right) - \delta}{\sqrt{1 + \frac{t'^2}{2n}}} \]

which is asymptotically distributed as a standard normal variable.


33.6 Random Number Generation 

Random numbers from a non-central t-distribution are easily obtained using the definition
above i.e. by using a random number from a normal distribution and another from a
chi-square distribution. This ought to be sufficient for most applications but if needed more
efficient techniques may easily be developed e.g. using more general techniques.
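
A short Python sketch of the definition $t' = (z + \delta)/\sqrt{w/n}$ follows. The function name is our own and the chi-square variate is built here as a sum of n squared standard normal numbers, which is an illustrative choice only.

```python
import math
import random

def noncentral_t(n, delta, rng=random):
    """t' = (z + delta) / sqrt(w / n), z standard normal, w chi-square(n)."""
    z = rng.gauss(0.0, 1.0)
    w = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return (z + delta) / math.sqrt(w / n)
```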
34 Normal Distribution


34.1 Introduction


The normal distribution or, as it is often called, the Gauss distribution is the most
important distribution in statistics. The distribution is given by

\[ f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left( \frac{x-\mu}{\sigma} \right)^2} \]

where $\mu$ is a location parameter, equal to the mean, and $\sigma$ the standard deviation. For
$\mu = 0$ and $\sigma = 1$ we refer to this distribution as the standard normal distribution. In many
connections it is sufficient to use this simpler form since $\mu$ and $\sigma$ simply may be regarded
as a shift and scale parameter, respectively. In figure 25 we show the standard normal
distribution.



z 


Figure 25: Standard normal distribution 

Below we give some useful information in connection with the normal distribution. 
Note, however, that this is only a minor collection since there is no limit on important and 
interesting statistical connections to this distribution. 

34.2 Moments 

The expectation value of the distribution is $E(x) = \mu$ and the variance $V(x) = \sigma^2$.

Generally odd central moments vanish due to the symmetry of the distribution and
even central moments are given by

\[ \mu_{2r} = \frac{(2r)!}{2^r r!}\,\sigma^{2r} = (2r-1)!!\,\sigma^{2r} \]

for $r \ge 1$.

It is sometimes also useful to evaluate absolute moments $E(|x|^n)$ for the normal
distribution. To do this we make use of the integral

\[ \int_{-\infty}^{\infty} e^{-ax^2}\,dx = \sqrt{\frac{\pi}{a}} \]

which if differentiated k times with respect to a yields

\[ \int_{-\infty}^{\infty} x^{2k} e^{-ax^2}\,dx = \frac{(2k-1)!!}{2^k} \sqrt{\frac{\pi}{a^{2k+1}}} \]

In our case $a = 1/2\sigma^2$ and since even absolute moments are identical to the algebraic
moments it is enough to evaluate odd absolute moments for which we get

\[ E(|x|^{2k+1}) = \frac{2}{\sigma\sqrt{2\pi}} \int_0^\infty x^{2k+1} e^{-\frac{x^2}{2\sigma^2}}\,dx = \sqrt{\frac{2}{\pi}}\,\frac{(2\sigma^2)^{k+1}}{2\sigma} \int_0^\infty y^k e^{-y}\,dy \]

The last integral we recognize as being equal to k! and we finally obtain the absolute
moments of the normal distribution as

\[ E(|x|^n) = \begin{cases} (n-1)!!\,\sigma^n & \text{for } n = 2k \\ \sqrt{\frac{2}{\pi}}\,2^k k!\,\sigma^{2k+1} & \text{for } n = 2k+1 \end{cases} \]

The half-width at half-height of the normal distribution is given by $\sqrt{2 \ln 2}\,\sigma \approx 1.177\sigma$
which may be useful to remember when estimating $\sigma$ using a ruler.


34.3 Cumulative Function 

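The distribution function, or cumulative function, may be expressed in terms of the
incomplete gamma function P as

\[ F(z) = \begin{cases} \frac{1}{2} - \frac{1}{2} P\!\left( \frac{1}{2}, \frac{z^2}{2} \right) & \text{if } z < 0 \\ \frac{1}{2} + \frac{1}{2} P\!\left( \frac{1}{2}, \frac{z^2}{2} \right) & \text{if } z \ge 0 \end{cases} \]

or we may use the error function $\mathrm{erf}(z/\sqrt{2})$ in place of the incomplete gamma function.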


34.4 Characteristic Function 


The characteristic function for the normal distribution is easily found from the general
definition

\[ \phi(t) = E\left( e^{itx} \right) = \exp\left\{ i\mu t - \tfrac{1}{2}\sigma^2 t^2 \right\} \]

34.5 Addition Theorem


The so called Addition theorem for normally distributed variables states that any linear
combination of independent normally distributed random variables $x_i$ (i = 1, 2, ..., n) is
also distributed according to the normal distribution.

If each $x_i$ is drawn from a normal distribution with mean $\mu_i$ and variance $\sigma_i^2$ then regard
the linear combination

\[ S = \sum_{i=1}^{n} a_i x_i \]

where $a_i$ are real coefficients. Each term $a_i x_i$ has characteristic function

\[ \phi_{a_i x_i}(t) = \exp\left\{ (a_i\mu_i)it - \tfrac{1}{2}(a_i^2\sigma_i^2)t^2 \right\} \]

and thus S has characteristic function

\[ \phi_S(t) = \prod_{i=1}^{n} \phi_{a_i x_i}(t) = \exp\left\{ \left( \sum_{i=1}^{n} a_i\mu_i \right) it - \frac{1}{2} \left( \sum_{i=1}^{n} a_i^2\sigma_i^2 \right) t^2 \right\} \]

which is seen to be a normal distribution with mean $\sum a_i\mu_i$ and variance $\sum a_i^2\sigma_i^2$.

34.6 Independence of $\bar{x}$ and $s^2$

A unique property of the normal distribution is the independence of the sample statistics
$\bar{x}$ and $s^2$, estimates of the mean and variance of the distribution. Recall that these
quantities are defined as

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{and} \quad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

where $\bar{x}$ is an estimator of the true mean $\mu$ and $s^2$ is the usual unbiased estimator for the
true variance $\sigma^2$.

For a population of n events from a normal distribution $\bar{x}$ has the distribution $N(\mu, \sigma^2/n)$
and $(n-1)s^2/\sigma^2$ is distributed according to a chi-square distribution with n - 1 degrees of
freedom. Using the relation

\[ \sum_{i=1}^{n} \left( \frac{x_i - \mu}{\sigma} \right)^2 = \frac{(n-1)s^2}{\sigma^2} + \left( \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \right)^2 \]

and creating the joint characteristic function for the variables $(n-1)s^2/\sigma^2$ and
$(\sqrt{n}(\bar{x} - \mu)/\sigma)^2$ one may show that this function factorizes thus implying independence of these
quantities and thus also of $\bar{x}$ and $s^2$.

In summary the “independence theorem” states that given n independent random
variables with identical normal distributions the two statistics $\bar{x}$ and $s^2$ are independent. Also
conversely it holds that if the mean $\bar{x}$ and the variance $s^2$ of a random sample are
independent then the population is normal.

34.7 Probability Content

The probability content of the normal distribution is often referred to in statistics. When
the term one standard deviation is mentioned one immediately thinks in terms of a
probability content of 68.3% within the symmetric interval from the value given.

Without loss of generality we may treat the standard normal distribution only since
the transformation from a more general case is straightforward putting $z = (x - \mu)/\sigma$. In
different situations one may want to find

• the probability content, two-sided or one-sided, to exceed a certain number of standard
deviations, or

• the number of standard deviations corresponding to a certain probability content.

In calculating this we need to evaluate integrals like

\[ \alpha = \int \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt \]

taken between the appropriate limits. There is no explicit solution to this integral but it is
related to the error function (see section 13) as well as the incomplete gamma function (see
section 42).

\[ \frac{1}{\sqrt{2\pi}} \int_{-z}^{z} e^{-t^2/2}\,dt = \mathrm{erf}\!\left( \frac{z}{\sqrt{2}} \right) = P\!\left( \frac{1}{2}, \frac{z^2}{2} \right) \]

These relations may be used to calculate the probability content. Especially the error
function is often available as a system function on different computers. Beware, however,
that some implementations appear to define erf(z) as the symmetric integral from -z to z,
in which case the $\sqrt{2}$ factor should not be supplied. Besides the above relations there are
also excellent approximations to the integral which may be used.
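
As an illustration, the relation to the error function may be used as in the Python sketch below. Python's `math.erf` and `math.erfc` follow the standard definition $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} dt$, so here the factor $\sqrt{2}$ is needed; the function names are our own.

```python
import math

def two_sided_content(z):
    """Probability content of the standard normal within [-z, z]."""
    return math.erf(z / math.sqrt(2.0))

def one_sided_tail(z):
    """Probability content from z to infinity.

    The complementary error function is used since it keeps precision
    in the far tail, where 1 - erf(...) would lose significant digits.
    """
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

For example, `two_sided_content(1.0)` gives the familiar 68.3% for one standard deviation.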

In the tables below we give the probability content for exact z-values (left-hand table)
as well as z-values for exact probability contents (right-hand table). The column headings
∫(a,b) denote the integral of the standard normal distribution from a to b.

z     ∫(-∞,z)   ∫(-z,z)   ∫(z,∞)          |  z         ∫(-∞,z)   ∫(-z,z)   ∫(z,∞)
0.0   0.50000   0.00000   0.50000         |  0.00000   0.5       0.0       0.5
0.5   0.69146   0.38292   0.30854         |  0.25335   0.6       0.2       0.4
1.0   0.84134   0.68269   0.15866         |  0.67449   0.75      0.5       0.25
1.5   0.93319   0.86639   0.06681         |  0.84162   0.8       0.6       0.2
2.0   0.97725   0.95450   0.02275         |  1.28155   0.9       0.8       0.1
2.5   0.99379   0.98758   6.210 · 10^-3   |  1.64485   0.95      0.9       0.05
3.0   0.99865   0.99730   1.350 · 10^-3   |  1.95996   0.975     0.95      0.025
3.5   0.99977   0.99953   2.326 · 10^-4   |  2.32635   0.99      0.98      0.01
4.0   0.99997   0.99994   3.167 · 10^-5   |  2.57583   0.995     0.99      0.005
4.5   1.00000   0.99999   3.398 · 10^-6   |  3.09023   0.999     0.998     0.001
5.0   1.00000   1.00000   2.867 · 10^-7   |  3.29053   0.9995    0.999     0.0005
6.0   1.00000   1.00000   9.866 · 10^-10  |  3.71902   0.9999    0.9998    0.0001
7.0   1.00000   1.00000   1.280 · 10^-12  |  3.89059   0.99995   0.9999    0.00005
8.0   1.00000   1.00000   6.221 · 10^-16  |  4.26489   0.99999   0.99998   0.00001


It is sometimes of interest to scrutinize extreme significance levels which implies
integrating the far tails of a normal distribution. In the table below we give the number of
standard deviations, z, required in order to achieve a one-tailed probability content of $10^{-n}$.

z-values for which $\frac{1}{\sqrt{2\pi}} \int_z^\infty e^{-z^2/2}\,dz = 10^{-n}$ for n = 1, 2, ..., 23

 n   z          n   z          n   z          n   z          n   z
 1   1.28155    6   4.75342   11   6.70602   16   8.22208   21   9.50502
 2   2.32635    7   5.19934   12   7.03448   17   8.49379   22   9.74179
 3   3.09023    8   5.61200   13   7.34880   18   8.75729   23   9.97305
 4   3.71902    9   5.99781   14   7.65063   19   9.01327
 5   4.26489   10   6.36134   15   7.94135   20   9.26234


Below is also given the one-tailed probability content for a standard normal
distribution in the region from z to ∞ (or -∞ to -z). The information in the previous as well as
this table is taken from [26].

Probability content $Q(z) = \frac{1}{\sqrt{2\pi}} \int_z^\infty e^{-z^2/2}\,dz$ for z = 1, 2, ..., 50, 60, ..., 100, 150, ..., 500

  z   -log Q(z)     z   -log Q(z)     z   -log Q(z)     z   -log Q(z)     z   -log Q(z)
  1     0.79955    14    44.10827    27   160.13139    40   349.43701    80    1392.04459
  2     1.64302    15    50.43522    28   172.09024    41   367.03664    90    1761.24604
  3     2.86970    16    57.19458    29   184.48283    42   385.07032   100    2173.87154
  4     4.49934    17    64.38658    30   197.30921    43   403.53804   150    4888.38812
  5     6.54265    18    72.01140    31   210.56940    44   422.43983   200    8688.58977
  6     9.00586    19    80.06919    32   224.26344    45   441.77568   250   13574.49960
  7    11.89285    20    88.56010    33   238.39135    46   461.54561   300   19546.12790
  8    15.20614    21    97.48422    34   252.95315    47   481.74964   350   26603.48018
  9    18.94746    22   106.84167    35   267.94888    48   502.38776   400   34746.55970
 10    23.11805    23   116.63253    36   283.37855    49   523.45999   450   43975.36860
 11    27.71882    24   126.85686    37   299.24218    50   544.96634   500   54289.90830
 12    32.75044    25   137.51475    38   315.53979    60   783.90743
 13    38.21345    26   148.60624    39   332.27139    70  1066.26576


Beware, however, that extreme significance levels are purely theoretical and that one
seldom or never should trust experimental limits at these levels. In experimental
situations one rarely fulfills the statistical laws to such detail and any bias or background may
heavily affect statements on extremely small probabilities.


Although one normally would use a routine to find the probability content for a normal
distribution it is sometimes convenient to have a “classical” table available. In table 6 on
page 176 we give probability contents for a symmetric region from -z to z for z-values
ranging from 0.00 to 3.99 in steps of 0.01. Conversely we give in table 7 on page 177 the
z-values corresponding to specific probability contents from 0.000 to 0.998 in steps of 0.002.

34.8 Random Number Generation

There are many different methods to obtain random numbers from a normal distribution
some of which are reviewed below. It is enough to consider the case of a standard normal
distribution since given such a random number z we may easily obtain one from a general
normal distribution by making the transformation $x = \mu + \sigma z$.

Below f(x) denotes the standard normal distribution and if not explicitly stated all
variables denoted by $\xi$ are uniform random numbers in the range from zero to one.

34.8.1 Central Limit Theory Approach 

The sum of $n$ independent random numbers from a uniform distribution between zero and
one, $R_n$, has expectation value $E(R_n) = n/2$ and variance $V(R_n) = n/12$. By the central
limit theorem the quantity

$$z_n = \frac{R_n - E(R_n)}{\sqrt{V(R_n)}} = \frac{R_n - \frac{n}{2}}{\sqrt{\frac{n}{12}}}$$

approaches the standard normal distribution as $n \to \infty$. A practical choice is $n = 12$ since
this expression simplifies to $z_{12} = R_{12} - 6$, which could be taken as a random number from
a standard normal distribution. Note, however, that this method is neither accurate nor
fast.
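
As an illustration, the central limit approach may be sketched in a few lines of Python. This
is only a minimal sketch: the function name `normal_clt` and the use of Python's `random`
module as the uniform generator are our assumptions and not part of the method itself.

```python
import random


def normal_clt(n=12, rng=random):
    """Approximate standard normal deviate from the sum of n uniform numbers.

    Hypothetical helper: computes z_n = (R_n - n/2) / sqrt(n/12).
    """
    r_n = sum(rng.random() for _ in range(n))
    return (r_n - n / 2.0) / (n / 12.0) ** 0.5
```

With the default $n = 12$ the result can never leave the interval from $-6$ to $6$, which is one
way to see why the tails of the distribution are badly reproduced.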

34.8.2 Exact Transformation 

The Box-Muller transformation used to find random numbers from the binormal distribution
(see section 6.5 on page 22), using two uniform random numbers between zero and
one in $\xi_1$ and $\xi_2$,

$$z_1 = \sqrt{-2\ln\xi_1}\,\sin 2\pi\xi_2$$

$$z_2 = \sqrt{-2\ln\xi_1}\,\cos 2\pi\xi_2$$

may be used to obtain two independent random numbers from a standard normal
distribution.
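
The transformation above translates directly into code. The sketch below assumes Python's
`random` module as the source of uniform numbers; the name `box_muller` is ours, and
$\xi_1$ is taken from the half-open interval (0, 1] only to avoid the logarithm of zero.

```python
import math
import random


def box_muller(rng=random):
    """Return two independent standard normal deviates (z1, z2)."""
    xi1 = 1.0 - rng.random()          # in (0, 1], so log(xi1) is finite
    xi2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(xi1))
    angle = 2.0 * math.pi * xi2
    return radius * math.sin(angle), radius * math.cos(angle)
```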

34.8.3 Polar Method 

The above method may be altered in order to avoid the cosine and sine by

i Generate $u$ and $v$ as two uniformly distributed random numbers in the range from $-1$
to $1$ by $u = 2\xi_1 - 1$ and $v = 2\xi_2 - 1$.

ii Calculate $w = u^2 + v^2$ and if $w > 1$ then go back to i.

iii Return $x = uz$ and $y = vz$ with $z = \sqrt{-2\ln w / w}$ as two independent random numbers
from a standard normal distribution.

This method is often faster than the previous since it eliminates the sine and cosine
at the slight expense of $1 - \pi/4 \approx 21$% rejection in step ii and a few more arithmetic
operations. As is easily seen $u/\sqrt{w}$ and $v/\sqrt{w}$ play the role of the cosine and the sine in
the previous method.
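
Steps i to iii may be sketched as follows. The function name `polar_pair` is our own and
the extra guard against $w = 0$ (which would otherwise give a division by zero) is an
addition not mentioned in the steps above.

```python
import math
import random


def polar_pair(rng=random):
    """Return two independent standard normal deviates by the polar method."""
    while True:
        u = 2.0 * rng.random() - 1.0      # step i
        v = 2.0 * rng.random() - 1.0
        w = u * u + v * v                 # step ii
        if 0.0 < w <= 1.0:                # reject points outside the unit circle
            z = math.sqrt(-2.0 * math.log(w) / w)
            return u * z, v * z           # step iii
```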

34.8.4 Trapezoidal Method

The maximum trapezoid that may be inscribed under the standard normal curve covers an
area of 91.95% of the total area. Random numbers from a trapezoid are easily obtained by a
linear combination of two uniform random numbers. In the remaining cases a tail-technique
and accept-reject techniques, as described in figure 26, are used.





Figure 26: Trapezoidal method 

Below we describe, in some detail, a slightly modified version of what is presented in 
[28]. For more exact values of the constants used see this reference. 

i Generate two uniform random numbers between zero and one, $\xi$ and $\xi_0$

ii If $\xi < 0.9195$ generate a random number from the trapezoid by $x = 2.404\xi_0 + 1.984\xi - 2.114$
and exit

iii Else if $\xi < 0.9541$ (3.45% of all cases) generate a random number from the tail $x > 2.114$

a Generate two uniform random numbers $\xi_1$ and $\xi_2$
b Put $x = 2.114^2 - 2\ln\xi_1$ and if $x\xi_2^2 > 2.114^2$ then go back to a
c Put $x = \sqrt{x}$ and go to vii

iv Else if $\xi < 0.9782$ (2.41% of all cases) generate a random number from the region
$0.290 < x < 1.840$ between the normal curve and the trapezoid

a Generate two uniform random numbers $\xi_1$ and $\xi_2$
b Put $x = 0.290 + 1.551\xi_1$ and if $f(x) - 0.443 + 0.210x < 0.016\xi_2$ then go to a
c Go to vii

v Else if $\xi < 0.9937$ (1.55% of all cases) generate a random number from the region
$1.840 < x < 2.114$ between the normal curve and the trapezoid

a Generate two uniform random numbers $\xi_1$ and $\xi_2$
b Put $x = 1.840 + 0.274\xi_1$ and if $f(x) - 0.443 + 0.210x < 0.043\xi_2$ then go to a
c Go to vii

vi Else, in 0.63% of all cases, generate a random number from the region $0 < x < 0.290$
between the normal curve and the trapezoid by

a Generate two uniform random numbers $\xi_1$ and $\xi_2$
b Put $x = 0.290\xi_1$ and if $f(x) - 0.383 < 0.016\xi_2$ then go back to a

vii Assign a minus sign to $x$ if $\xi_0 > \frac{1}{2}$
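
The tail technique of step iii is used again in several of the methods below, so a small
sketch of it may be useful. The function name `normal_tail` and the parameter `a` (the
lower edge of the tail, 2.114 in step iii) are our choices; the code returns the positive tail
only and leaves the sign to the caller.

```python
import math
import random


def normal_tail(a, rng=random):
    """Random number from the standard normal tail x > a (a > 0)."""
    while True:
        xi1 = 1.0 - rng.random()             # in (0, 1], avoids log(0)
        xi2 = rng.random()
        x = a * a - 2.0 * math.log(xi1)      # step a/b: candidate for x squared
        if x * xi2 * xi2 <= a * a:           # accept with probability a / sqrt(x)
            return math.sqrt(x)              # step c
```

The candidate $\sqrt{x}$ has a density proportional to $t e^{-t^2/2}$ for $t > a$ and the acceptance
probability $a/t$ turns this into the required $e^{-t^2/2}$.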

34.8.5 Center-tail method 

Ahrens and Dieter [28] also propose a so called center-tail method. In their article they
treat the tails outside $|z| > \sqrt{2}$ with a special tail method which avoids the logarithm.
However, it turns out that using the same tail method as in the previous method is even
faster. The method is as follows:

i Generate a uniform random number $\xi$ and use the first bit after the decimal point as
a sign bit $s$ i.e. for $\xi < \frac{1}{2}$ put $\xi = 2\xi$ and $s = -1$ and for $\xi \ge \frac{1}{2}$ put $\xi = 2\xi - 1$ and
$s = 1$

ii If $\xi > 0.842700792949715$ (the area for $-\sqrt{2} < z < \sqrt{2}$) go to vi.

iii Center method: Generate $\xi_0$ and set $\nu = \xi + \xi_0$

iv Generate $\xi_1$ and $\xi_2$ and set $\nu^* = \max(\xi_1, \xi_2)$.
If $\nu < \nu^*$ calculate $y = \xi_0 - \xi$ and go to viii

v Generate $\xi_1$ and $\xi_2$ and set $\nu = \max(\xi_1, \xi_2)$.
If $\nu < \nu^*$ go to iv else go to iii

vi Tail method: Generate $\xi_1$ and set $y = 1 - \ln\xi_1$

vii Generate $\xi_2$ and if $y\xi_2^2 > 1$ go to vi else put $y = \sqrt{y}$

viii Set $x = sy\sqrt{2}$.

34.8.6 Composition-rejection Methods 

In reference [21] two methods using the composition-rejection method are proposed. The
first one, attributed to Butcher [23] and Kahn, uses only one term in the sum and has
$\alpha = \sqrt{2e/\pi}$, $f(x) = \exp\{-x\}$ and $g(x) = \exp\{-(x - 1)^2/2\}$. The algorithm is as follows:

i Generate $\xi_1$ and $\xi_2$

ii Determine $x = -\ln\xi_1$, i.e. a random number from $f(x)$

iii Determine $g(x) = \exp\{-(x - 1)^2/2\}$

iv If $\xi_2 > g(x)$ then go to i

v Decide a sign either by generating a new random number, or by using $\xi_2$ for which
$0 < \xi_2 \le g(x)$ here, and exit with $x$ with this sign.
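
A minimal sketch of this one-term algorithm is given below. The function name is our own
and, of the two options in step v, we have assumed the simpler one of drawing a fresh
uniform number for the sign.

```python
import math
import random


def normal_cr_one_term(rng=random):
    """Standard normal deviate by composition-rejection with one term."""
    while True:
        xi1 = 1.0 - rng.random()                 # step i, xi1 in (0, 1]
        xi2 = rng.random()
        x = -math.log(xi1)                       # step ii, exponential deviate
        g = math.exp(-0.5 * (x - 1.0) ** 2)      # step iii
        if xi2 <= g:                             # step iv, otherwise try again
            sign = 1.0 if rng.random() < 0.5 else -1.0   # step v
            return sign * x
```

Since $\alpha f(x) g(x) = \sqrt{2/\pi}\, e^{-x^2/2}$ the accepted $x$ follows the folded normal
distribution and the acceptance probability is $1/\alpha \approx 0.76$.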

The second method was originally proposed by J. C. Butcher [23] and uses two terms

$$\alpha_1 = \sqrt{2/\pi}, \quad f_1(x) = 1, \quad g_1(x) = e^{-x^2/2} \quad \text{for } 0 \le x \le 1$$

$$\alpha_2 = 1/\sqrt{2\pi}, \quad f_2(x) = 2e^{-2(x-1)}, \quad g_2(x) = e^{-(x-2)^2/2} \quad \text{for } x > 1$$

i Generate $\xi_1$ and $\xi_2$

ii If $\xi_1 - \frac{2}{3} > 0$ then determine $x = 1 - \frac{1}{2}\ln(3\xi_1 - 2)$ and $z = \frac{1}{2}(x - 2)^2$ else determine
$x = 3\xi_1/2$ and $z = x^2/2$

iii Determine $g = e^{-z}$

iv If $\xi_2 > g$ then go to i

v Determine the sign of $\xi_2 - g/2$ and exit with $x$ with this sign.
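
The two-term algorithm may be sketched in the same way. The function name is ours and
the guard on the argument of the logarithm is an added precaution against rounding when
$\xi_1$ is extremely close to 2/3.

```python
import math
import random


def normal_cr_two_terms(rng=random):
    """Standard normal deviate by composition-rejection with two terms."""
    while True:
        xi1 = rng.random()                       # step i
        xi2 = rng.random()
        if xi1 > 2.0 / 3.0:                      # step ii, second term (x > 1)
            arg = 3.0 * xi1 - 2.0
            if arg <= 0.0:                       # rounding guard, practically never hit
                continue
            x = 1.0 - 0.5 * math.log(arg)
            z = 0.5 * (x - 2.0) ** 2
        else:                                    # first term (0 <= x <= 1)
            x = 1.5 * xi1
            z = 0.5 * x * x
        g = math.exp(-z)                         # step iii
        if xi2 <= g:                             # step iv
            return x if xi2 > 0.5 * g else -x    # step v, sign of xi2 - g/2
```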

34.8.7 Method by Marsaglia 

A nice method proposed by G. Marsaglia is based on inscribing a spline function beneath the 
standard normal curve and subsequently a triangular distribution beneath the remaining 
difference. See figure 27 for a graphical presentation of the method. The algorithm used is 
described below. 



Figure 27: Marsaglia method 


• The sum of three uniform random numbers $\xi_1$, $\xi_2$, and $\xi_3$ follows a parabolic spline
function. Using $x = 2(\xi_1 + \xi_2 + \xi_3 - \frac{3}{2})$ we obtain a distribution

$$f_1(x) = \begin{cases} (3 - x^2)/8 & \text{if } |x| \le 1 \\ (3 - |x|)^2/16 & \text{if } 1 < |x| \le 3 \\ 0 & \text{if } |x| > 3 \end{cases}$$

Maximizing $\alpha_1$ with the constraint $f(x) - \alpha_1 f_1(x) \ge 0$ in the full interval $|x| \le 3$
gives $\alpha_1 = 16e^{-2}/\sqrt{2\pi} \approx 0.8638554$ i.e. in about 86% of all cases such a combination
is made.

• Moreover, a triangular distribution given by making the combination $x = \frac{3}{2}(\xi_1 + \xi_2 - 1)$
leading to a function

$$f_2(x) = \frac{4}{9}\left(\frac{3}{2} - |x|\right) \quad \text{for } |x| < \frac{3}{2}$$

and zero elsewhere. This function may be inscribed under the remaining curve
$f(x) - \alpha_1 f_1(x)$ maximizing $\alpha_2$ such that $f_3(x) = f(x) - \alpha_1 f_1(x) - \alpha_2 f_2(x) \ge 0$ in
the interval $|x| \le \frac{3}{2}$. This leads to a value $\alpha_2 \approx 0.1108$ i.e. in about 11% of all cases
this combination is used.

• The maximum value of $f_3(x)$ in the region $|x| \le 3$ is 0.0081 and here we use a
straightforward reject-accept technique. This is done in about 2.26% of all cases.

• Finally, the tails outside $|x| > 3$, covering about 0.27% of the total area, are dealt with
by a standard tail-method where

a Put $x = 9 - 2\ln\xi_1$
b If $x\xi_2^2 > 9$ then go to a
c Else generate a sign $s = +1$ or $s = -1$ with equal probability and exit with
$x = s\sqrt{x}$
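
The value of $\alpha_1$ quoted in the first item can be checked numerically. The sketch below
is only such a check and not a generator: it scans the ratio $f(x)/f_1(x)$ on a grid (the
grid size and all function names are our choices) and returns its minimum, which is the
largest $\alpha_1$ keeping $f(x) - \alpha_1 f_1(x)$ non-negative.

```python
import math


def std_normal_pdf(x):
    """Standard normal density f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def spline_pdf(x):
    """Density f_1(x) of x = 2*(xi1 + xi2 + xi3 - 3/2)."""
    ax = abs(x)
    if ax <= 1.0:
        return (3.0 - ax * ax) / 8.0
    if ax <= 3.0:
        return (3.0 - ax) ** 2 / 16.0
    return 0.0


def alpha1_max(steps=6000):
    """Largest alpha_1 with f(x) - alpha_1*f_1(x) >= 0 for 0 <= x < 3."""
    best = float("inf")
    for i in range(steps):               # x = 3 itself is excluded since f_1(3) = 0
        x = 3.0 * i / steps
        best = min(best, std_normal_pdf(x) / spline_pdf(x))
    return best
```

The minimum is attained at $|x| = 2$ where $f_1 = 1/16$, which is the origin of the factor
$16e^{-2}$ in the expression for $\alpha_1$.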

34.8.8 Histogram Technique 

Yet another method due to G. Marsaglia and collaborators [37] is one where a histogram
with $k$ bins and bin-width $c$ is inscribed under the (folded) normal curve. The difference
between the normal curve and the histogram is treated with a combination of triangular
distributions and accept-reject techniques as well as the usual technique for the tails. Trying
to optimize fast generation we found $k = 9$ and $c = \frac{1}{3}$ to be a fair choice. This may, however,
not be true on all computers. See figure 28 for a graphical presentation of the method.





Figure 28: Histogram method 
The algorithm is as follows: 

i Generate $\xi_1$ and choose which region to generate from. This is done e.g. with a
sequential search in a cumulative vector where the areas of the regions have been
sorted in descending order. The number of elements in this vector is $2k + \frac{1}{c} + 1$ which
for the parameters mentioned above becomes 22.

ii If a histogram bin $i$ ($i = 1, 2, ..., k$) is selected then determine $x = (\xi_2 + i - 1)c$ and
go to vii.

iii If an inscribed triangle $i$ ($i = 1, 2, ..., \frac{1}{c}$) then determine $x = (\min(\xi_2, \xi_3) + i - 1)c$
and go to vii.

iv If subscribed triangle $i$ ($i = \frac{1}{c} + 1, ..., k$) then determine $x = (\min(\xi_2, \xi_3) + i - 1)c$ and
accept this value with a probability equal to the ratio between the normal curve and
the triangle at this x-value (histogram subtracted in both cases) else iterate. When
a value is accepted then go to vii.

v For the remaining $\frac{1}{c}$ regions between the inscribed triangles and the normal curve for
$x < 1$ use a standard reject-accept method in each bin and then go to vii.

vi If the tail region is selected then use a standard technique e.g. (a) $x = (kc)^2 - 2\ln\xi_2$,
(b) if $x\xi_3^2 > (kc)^2$ then go to a else use $x = \sqrt{x}$.

vii Attach a random sign to $x$ and exit. This is done by either generating a new uniform
random number or by saving the first bit of $\xi_1$ in step i. The latter is faster and the
degradation in precision is negligible.


34.8.9 Ratio of Uniform Deviates 


A technique using the ratio of two uniform deviates was proposed by A. J. Kinderman and
J. F. Monahan in 1977 [38]. It is based on selecting an acceptance region such that the ratio
of two uniform pseudorandom numbers follows the standard normal distribution. With $u$
and $v$ uniform random numbers, $u$ between 0 and 1 and $v$ between $-\sqrt{2/e}$ and $\sqrt{2/e}$, such
a region is defined by

$$v^2 < -4u^2 \ln u$$

as is shown in the left-hand side of figure 29.

Note that it is enough to consider the upper part ($v > 0$) of the acceptance limit due to
the symmetry of the problem. In order to avoid taking the logarithm, which may slow the
algorithm down, simpler boundary curves were designed. An improvement to the original
proposal was made by Joseph L. Leva in 1992 [39,40] choosing the same quadratic form for
both the lower and the upper boundary namely

$$Q(u, v) = (u - s)^2 - b(u - s)(v - t) + a(v - t)^2$$

Here $(s, t) = (0.449871, -0.386595)$ is the center of the ellipses and $a = 0.196$ and $b = 0.25472$
are suitable constants to obtain tight boundaries. In the right-hand side of figure 29 we
show the value of the quadratic form at the acceptance limit $Q(u, 2u\sqrt{-\ln u})$ as a function
of $u$. It may be deduced that only in the interval $r_1 < Q < r_2$ with $r_1 = 0.27597$ and
$r_2 = 0.27846$ we still have to evaluate the logarithm.


Figure 29: Method using ratio between two uniform deviates (right-hand panel: $Q(u,v)$ vs $u$ at limit)

The algorithm is as follows:

i Generate uniform random numbers $u = \xi_1$ and $v = 2\sqrt{2/e}\,(\xi_2 - \frac{1}{2})$.

ii Evaluate the quadratic form $Q = x^2 + y(ay - bx)$ with $x = u - s$ and $y = |v| - t$.

iii Accept if inside inner boundary, i.e. if $Q < r_1$, then go to vi.

iv Reject if outside upper boundary, i.e. if $Q > r_2$, then go to i.

v Reject if outside acceptance region, i.e. if $v^2 > -4u^2 \ln u$, then go to i.

vi Return the ratio $v/u$ as a pseudorandom number from a standard normal distribution.

On average 2.738 uniform random numbers are consumed and 0.012 logarithms are computed
for each standard normal random number obtained by this algorithm. As a comparison
the number of logarithmic evaluations without cutting on the boundaries, skipping
steps ii through iv above, would be 1.369. The penalty when using logarithms on modern
computers is not as severe as it used to be but still some efficiency is gained by using the
proposed algorithm.
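
Steps i to vi may be sketched as below. The constants are those quoted in the text; the
function and constant names are our own and $u$ is drawn from (0, 1] so that the logarithm
and the ratio $v/u$ are always defined.

```python
import math
import random

S, T = 0.449871, -0.386595               # center of the bounding ellipses
A, B = 0.196, 0.25472                    # constants of the quadratic form
R1, R2 = 0.27597, 0.27846                # inner and outer boundary values of Q
V_SCALE = 2.0 * math.sqrt(2.0 / math.e)  # v is uniform in +-sqrt(2/e)


def normal_ratio_of_uniforms(rng=random):
    """Standard normal deviate from the ratio of two uniform deviates."""
    while True:
        u = 1.0 - rng.random()                   # step i, u in (0, 1]
        v = V_SCALE * (rng.random() - 0.5)
        x = u - S                                # step ii
        y = abs(v) - T
        q = x * x + y * (A * y - B * x)
        if q < R1:                               # step iii, quick accept
            return v / u
        if q > R2:                               # step iv, quick reject
            continue
        if v * v > -4.0 * u * u * math.log(u):   # step v, exact test
            continue
        return v / u                             # step vi
```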

34.8.10 Comparison of random number generators 

Above we described several methods to achieve pseudorandom numbers from a standard
normal distribution. Which one is the most efficient may vary depending on the actual
implementation and the computer it is used on. To give a rough idea we found the following
times per random number⁸ (in the table are also given the average number of uniform
pseudorandom numbers consumed per random number in our implementations)

Method                              section    μs/r.n.    N_ξ/r.n.    comment
Trapezoidal method                  34.8.4     0.39        2.246
Polar method                        34.8.3     0.41        1.273      pair
Histogram method                    34.8.8     0.42        2.121
Box-Muller transformation           34.8.2     0.44        1.000      pair
Spline functions                    34.8.7     0.46        3.055
Ratio of two uniform deviates       34.8.9     0.55        2.738
Composition-rejection, two terms    34.8.6     0.68        2.394
Center-tail method                  34.8.5     0.88        5.844
Composition-rejection, one term     34.8.6     0.90        2.631
Central limit method approach       34.8.1     1.16       12.000      inaccurate

⁸ The timing was done on a Digital Personal Workstation 433au workstation running Unix version 4.0D
and all methods were programmed in standard Fortran as functions giving one random number at each
call.


The trapezoidal method is thus fastest but the difference is not great as compared to
some of the others. The central limit theorem method is slow as well as inaccurate although
it might be the easiest to remember. The other methods are all exact except for possible
numerical problems. “Pair” indicates that these generators give two random numbers at a
time, which implies that either one is not used or one is left pending for the next call
(as is the case in our implementations).
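
34.9 Tests on Parameters of a Normal Distribution

For observations from a normal sample different statistical distributions are applicable in
different situations when it comes to estimating one or both of the parameters $\mu$ and $\sigma$. In
the table below we try to summarize this in a condensed form.

TESTS OF MEAN AND VARIANCE OF NORMAL DISTRIBUTION

$H_0$                        Condition                                         Statistic                                                                                              Distribution

$\mu = \mu_0$                $\sigma^2$ known                                  $\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$                                                             $N(0,1)$
                             $\sigma^2$ unknown                                $\frac{\bar{x} - \mu_0}{s/\sqrt{n}}$                                                                  $t_{n-1}$

$\sigma^2 = \sigma_0^2$      $\mu$ known                                       $\frac{(n-1)s^2}{\sigma_0^2} = \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{\sigma_0^2}$                      $\chi^2_n$
                             $\mu$ unknown                                     $\frac{(n-1)s^2}{\sigma_0^2} = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{\sigma_0^2}$                  $\chi^2_{n-1}$

$\mu_1 = \mu_2 = \mu$        $\sigma_1^2 = \sigma_2^2 = \sigma^2$ known        $\frac{\bar{x} - \bar{y}}{\sigma\sqrt{\frac{1}{n} + \frac{1}{m}}}$                                   $N(0,1)$
                             $\sigma_1^2 \ne \sigma_2^2$ known                 $\frac{\bar{x} - \bar{y}}{\sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}}}$                       $N(0,1)$
                             $\sigma_1^2 = \sigma_2^2 = \sigma^2$ unknown      $\frac{\bar{x} - \bar{y}}{s\sqrt{\frac{1}{n} + \frac{1}{m}}}$ with $s^2 = \frac{(n-1)s_1^2 + (m-1)s_2^2}{n+m-2}$    $t_{n+m-2}$
                             $\sigma_1^2 \ne \sigma_2^2$ unknown               $\frac{\bar{x} - \bar{y}}{\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{m}}}$                                 $\approx N(0,1)$

$\sigma_1^2 = \sigma_2^2$    $\mu_1 \ne \mu_2$ known                           $\frac{s_1^2}{s_2^2} = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu_1)^2}{\frac{1}{m}\sum_{i=1}^{m}(y_i - \mu_2)^2}$            $F_{n,m}$
                             $\mu_1 \ne \mu_2$ unknown                         $\frac{s_1^2}{s_2^2} = \frac{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}{\frac{1}{m-1}\sum_{i=1}^{m}(y_i - \bar{y})^2}$    $F_{n-1,m-1}$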

35 Pareto Distribution

35.1 Introduction

The Pareto distribution is given by

$$f(x; \alpha, k) = \frac{\alpha k^\alpha}{x^{\alpha+1}}$$

where the variable $x \ge k$ and the parameter $\alpha > 0$ are real numbers. As is seen $k$ is only
a scale factor.

The distribution has its name after its inventor the Italian Vilfredo Pareto (1848-1923)
who worked in the fields of national economy and sociology (professor in Lausanne,
Switzerland). It was introduced in order to explain the distribution of wages in society.


35.2 Cumulative Distribution 

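The cumulative distribution is given by

$$F(x) = \int_k^x f(u)\,du = 1 - \left(\frac{k}{x}\right)^\alpha$$


35.3 Moments

Algebraic moments are given by

$$E(x^n) = \int_k^\infty x^n f(x)\,dx = \int_k^\infty \frac{\alpha k^\alpha}{x^{\alpha-n+1}}\,dx = \left[-\frac{\alpha k^\alpha}{(\alpha - n)\,x^{\alpha-n}}\right]_k^\infty = \frac{\alpha k^n}{\alpha - n}$$

which is defined for $\alpha > n$.

Especially the expectation value and variance are given by

$$E(x) = \frac{\alpha k}{\alpha - 1} \quad \text{for } \alpha > 1$$

$$V(x) = \frac{\alpha k^2}{(\alpha - 2)(\alpha - 1)^2} \quad \text{for } \alpha > 2$$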

35.4 Random Numbers 


To obtain a random number from a Pareto distribution we use the straightforward way of
solving the equation $F(x) = \xi$ with $\xi$ a random number uniformly distributed between zero
and one. This gives

$$F(x) = 1 - \left(\frac{k}{x}\right)^\alpha = \xi \quad \Rightarrow \quad x = \frac{k}{(1 - \xi)^{1/\alpha}}$$
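
The inversion formula may be sketched as follows, assuming Python's `random` module
for the uniform number; the function name `pareto_random` is ours.

```python
import random


def pareto_random(alpha, k, rng=random):
    """Pareto deviate by inversion: x = k / (1 - xi)**(1/alpha)."""
    xi = rng.random()                  # in [0, 1), so 1 - xi is in (0, 1]
    return k / (1.0 - xi) ** (1.0 / alpha)
```

A simple check of such a generator is to compare the sample median with the value
$k\,2^{1/\alpha}$ obtained from $F(x) = \frac{1}{2}$.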

36 Poisson Distribution

36.1 Introduction

The Poisson distribution is given by

$$p(r; \mu) = \frac{\mu^r e^{-\mu}}{r!}$$

where the variable $r$ is an integer ($r \ge 0$) and the parameter $\mu$ is a real positive quantity.
It is named after the French mathematician Siméon Denis Poisson (1781-1840) who was
the first to present this distribution in 1837 (implicitly the distribution was known already
in the beginning of the 18th century).

As is easily seen by comparing two subsequent r-values the distribution increases as long as
$r + 1 < \mu$ and then declines to zero. For low values of $\mu$ it is very skewed (for $\mu < 1$ it is
J-shaped).

The Poisson distribution describes the probability to find exactly $r$ events in a given
length of time if the events occur independently at a constant rate $\mu$. An unbiased and
efficient estimator of the Poisson parameter $\mu$ for a sample with $n$ observations $x_i$ is $\hat{\mu} = \bar{x}$,
the sample mean, with variance $V(\hat{\mu}) = \mu/n$.

For $\mu \to \infty$ the distribution tends to a normal distribution with mean $\mu$ and variance
$\mu$.

The Poisson distribution is one of the most important distributions in statistics with
many applications. Along with the properties of the distribution we give a few examples
here but for a more thorough description we refer to standard text-books.


36.2 Moments 

The expectation value, variance, third and fourth central moments of the Poisson
distribution are

$$E(r) = \mu$$
$$V(r) = \mu$$
$$\mu_3 = \mu$$
$$\mu_4 = \mu(1 + 3\mu)$$

The coefficients of skewness and kurtosis are $\gamma_1 = 1/\sqrt{\mu}$ and $\gamma_2 = 1/\mu$ respectively, i.e.
they tend to zero as $\mu \to \infty$ in accordance with the distribution becoming approximately
normally distributed for large values of $\mu$.

Algebraic moments may be found by the recursive formula

$$\mu'_{k+1} = \mu\left\{\mu'_k + \frac{d\mu'_k}{d\mu}\right\}$$

and central moments by a similar formula

$$\mu_{k+1} = \mu\left\{k\mu_{k-1} + \frac{d\mu_k}{d\mu}\right\}$$

For a Poisson distribution one may note that factorial moments $\mu_{[k]}$ (cf page 6) and
cumulants $\kappa_k$ (see section 2.5) become especially simple

$$\mu_{[k]} = E(r(r - 1) \cdots (r - k + 1)) = \mu^k$$
$$\kappa_r = \mu \quad \text{for all } r \ge 1$$


36.3 Probability Generating Function 

The probability generating function is given by

$$G(z) = E(z^r) = \sum_{r=0}^{\infty} z^r \frac{\mu^r e^{-\mu}}{r!} = e^{-\mu} \sum_{r=0}^{\infty} \frac{(\mu z)^r}{r!} = e^{\mu(z-1)}$$

Although we mostly use the probability generating function in the case of a discrete
distribution we may also define the characteristic function

$$\phi(t) = E(e^{\imath t r}) = e^{-\mu} \sum_{r=0}^{\infty} e^{\imath t r} \frac{\mu^r}{r!} = \exp\left\{\mu\left(e^{\imath t} - 1\right)\right\}$$

a result which could have been given directly since $\phi(t) = G(e^{\imath t})$.


36.4 Cumulative Distribution 


When calculating the probability content of a Poisson distribution we need the cumulative,
or distribution, function. This is easily obtained by finding the individual probabilities e.g.
by the recursive formula $p(r) = p(r - 1)\frac{\mu}{r}$ starting with $p(0) = e^{-\mu}$.

There is, however, also an interesting connection to the incomplete Gamma function
[10]

$$P(r) = \sum_{k=0}^{r} \frac{\mu^k e^{-\mu}}{k!} = 1 - P(r + 1, \mu)$$

with $P(a, x)$ the incomplete Gamma function not to be confused with $P(r)$.

Since the cumulative chi-square distribution also has a relation to the incomplete
Gamma function one may obtain a relation between these cumulative distributions namely

$$P(r) = \sum_{k=0}^{r} \frac{\mu^k e^{-\mu}}{k!} = 1 - \int_0^{2\mu} f(x; \nu = 2r + 2)\,dx$$

where $f(x; \nu = 2r + 2)$ denotes the chi-square distribution with $\nu$ degrees of freedom.
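
The recursive formula and the relation to the chi-square distribution can be compared
numerically. In the sketch below all function names are ours, and the chi-square integral
is evaluated with a plain Simpson rule (an arbitrary choice made only to keep the example
free from external libraries).

```python
import math


def poisson_cdf(r, mu):
    """P(r) = sum_{k=0}^{r} mu^k e^{-mu} / k! using p(k) = p(k-1) * mu / k."""
    p = math.exp(-mu)                    # p(0)
    total = p
    for k in range(1, r + 1):
        p *= mu / k
        total += p
    return total


def chi2_cdf_even(x, nu, steps=2000):
    """Integral from 0 to x of the chi-square density for even nu (Simpson rule)."""
    half = nu // 2
    norm = 2.0 ** half * math.factorial(half - 1)

    def pdf(t):
        return t ** (half - 1) * math.exp(-0.5 * t) / norm

    h = x / steps                        # steps must be even for Simpson's rule
    s = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        s += (4.0 if i % 2 else 2.0) * pdf(i * h)
    return s * h / 3.0
```

For any $\mu$ and $r$ the two quantities `poisson_cdf(r, mu)` and
`1 - chi2_cdf_even(2 * mu, 2 * r + 2)` should then agree to within the accuracy of the
numerical integration.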


36.5 Addition Theorem 

The so called addition theorem states that the sum of any number of independent
Poisson-distributed variables is also distributed according to a Poisson distribution.

For $n$ variables each distributed according to the Poisson distribution with parameters
(means) $\mu_i$ we find the characteristic function

$$\phi(r_1 + r_2 + \ldots + r_n) = \prod_{i=1}^{n} \exp\left\{\mu_i\left(e^{\imath t} - 1\right)\right\} = \exp\left\{\sum_{i=1}^{n} \mu_i \left(e^{\imath t} - 1\right)\right\}$$

which is the characteristic function for a Poisson variable with parameter $\mu = \sum \mu_i$.

36.6 Derivation of the Poisson Distribution

For a binomial distribution the rate of “success” $p$ may be very small but in a long series of
trials the total number of successes may still be a considerable number. In the limit $p \to 0$
and $N \to \infty$ but with $Np = \mu$ a finite constant we find

$$p(r) = \binom{N}{r} p^r (1 - p)^{N-r} \approx \frac{1}{r!} \cdot \frac{\sqrt{2\pi N}\, N^N e^{-N}}{\sqrt{2\pi(N - r)}\,(N - r)^{N-r} e^{-(N-r)}} \left(\frac{\mu}{N}\right)^r \left(1 - \frac{\mu}{N}\right)^{N-r} =$$

$$= \frac{1}{r!} \sqrt{\frac{N}{N - r}} \cdot \frac{1}{\left(1 - \frac{r}{N}\right)^{N-r} e^r}\, \mu^r \left(1 - \frac{\mu}{N}\right)^{N-r} \to \frac{\mu^r e^{-\mu}}{r!}$$

as $N \to \infty$ and where we have used that $\lim_{n \to \infty} (1 - \frac{x}{n})^n = e^{-x}$ and Stirling’s formula (see
section 42.2) for the factorial of a large number $n! \approx \sqrt{2\pi n}\, n^n e^{-n}$.

It was this approximation to the binomial distribution which S. D. Poisson presented
in his book in 1837.


36.7 Histogram 

In a histogram of events we would regard the distribution of the bin contents as
multinomially distributed if the total number of events $N$ were regarded as a fixed number.
If, however, we regard the total number of events not as fixed but as distributed
according to a Poisson distribution with mean $\nu$ we obtain a different result (with $k$ bins
in the histogram and the multinomial probabilities for each bin in the vector $\mathbf{p}$).

Given a multinomial distribution, denoted $M(\mathbf{r}; N, \mathbf{p})$, for the distribution of events into
bins for fixed $N$ and a Poisson distribution, denoted $P(N; \nu)$, for the distribution of $N$ we
write the joint distribution

$$P(\mathbf{r}, N) = M(\mathbf{r}; N, \mathbf{p})\,P(N; \nu) = \left(\frac{N!}{r_1!\, r_2! \cdots r_k!}\, p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}\right) \left(\frac{\nu^N e^{-\nu}}{N!}\right) =$$

$$= \left(\frac{(\nu p_1)^{r_1} e^{-\nu p_1}}{r_1!}\right) \left(\frac{(\nu p_2)^{r_2} e^{-\nu p_2}}{r_2!}\right) \cdots \left(\frac{(\nu p_k)^{r_k} e^{-\nu p_k}}{r_k!}\right)$$

where we have used that

$$\sum_{i=1}^{k} p_i = 1 \quad \text{and} \quad \sum_{i=1}^{k} r_i = N$$

i.e. we get a product of independent Poisson distributions with means $\nu p_i$ for each individual
bin. A simpler case leading to the same result would be the classification into only two
groups using a binomial and a Poisson distribution.

The assumption of independent Poisson distributions for the number of events in each bin
is behind the usual rule of using $\sqrt{N}$ as the standard deviation in a bin with $N$ entries and
neglecting correlations between bins in a histogram.

36.8 Random Number Generation

By use of the cumulative technique, e.g. forming the cumulative distribution by starting
with $p(0) = e^{-\mu}$, using the recursive formula

$$p(r) = p(r - 1)\,\frac{\mu}{r}$$

and adding up the terms, a random number from a Poisson distribution is easily obtained
using one uniform random number between zero and one. If $\mu$ is a constant the by far
fastest generation is obtained if the cumulative vector is prepared once and for all.

An alternative is to obtain, in $r$, a random number from a Poisson distribution by
multiplying independent uniform random numbers $\xi_i$ until

$$\prod_{i=0}^{r} \xi_i < e^{-\mu}$$

For large values of $\mu$ use the normal approximation but beware of the fact that the
Poisson distribution is a function in a discrete variable.
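
Both techniques may be sketched as follows. The function names are ours; the first
function recomputes the cumulative sum at every call rather than using a prepared vector,
and its test on `p > 0.0` is an added safeguard against an endless loop caused by rounding.

```python
import math
import random


def poisson_cumulative(mu, rng=random):
    """Poisson deviate by the cumulative technique with p(r) = p(r-1) * mu / r."""
    xi = rng.random()
    r = 0
    p = math.exp(-mu)                    # p(0)
    cumulative = p
    while xi >= cumulative and p > 0.0:  # p > 0 guards against rounding problems
        r += 1
        p *= mu / r
        cumulative += p
    return r


def poisson_product(mu, rng=random):
    """Poisson deviate by multiplying uniforms until the product is below e^-mu."""
    limit = math.exp(-mu)
    r = 0
    product = rng.random()               # xi_0
    while product >= limit:
        r += 1
        product *= rng.random()          # include xi_r
    return r
```

The second method consumes on average $\mu + 1$ uniform random numbers per call and
therefore becomes slow for large $\mu$.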

37 Rayleigh Distribution

37.1 Introduction

The Rayleigh distribution is given by

$$f(x; \alpha) = \frac{x}{\alpha^2}\, e^{-\frac{x^2}{2\alpha^2}}$$

for real positive values of the variable $x$ and a real positive parameter $\alpha$. It is named after
the British physicist Lord Rayleigh (1842-1919), also known as Baron John William Strutt
Rayleigh of Terling Place and Nobel prize winner in physics 1904.

Note that the parameter $\alpha$ is simply a scale factor and that the variable $y = x/\alpha$ has
the simplified distribution $g(y) = ye^{-y^2/2}$.



Figure 30: The Rayleigh distribution 

The distribution, shown in figure 30, has a mode at $x = \alpha$ and is positively skewed.

37.2 Moments 

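Algebraic moments are given by

$$E(x^n) = \int_0^\infty x^n f(x)\,dx = \frac{1}{2\alpha^2} \int_{-\infty}^{\infty} |x|^{n+1} e^{-x^2/2\alpha^2}\,dx$$

i.e. we have a connection to the absolute moments of the Gauss distribution. Using these
(see section 34 on the normal distribution) the result is

$$E(x^n) = \begin{cases} \sqrt{\frac{\pi}{2}}\; n!!\, \alpha^n & \text{for } n \text{ odd} \\ 2^k k!\, \alpha^{2k} & \text{for } n = 2k \end{cases}$$

Specifically we note that the expectation value, variance, and the third and fourth
central moments are given by

$$E(x) = \alpha\sqrt{\frac{\pi}{2}}, \quad V(x) = \alpha^2\left(2 - \frac{\pi}{2}\right), \quad \mu_3 = \alpha^3(\pi - 3)\sqrt{\frac{\pi}{2}}, \quad \text{and} \quad \mu_4 = \alpha^4\left(8 - \frac{3\pi^2}{4}\right)$$

The coefficients of skewness and kurtosis are thus

$$\gamma_1 = \frac{(\pi - 3)\sqrt{\frac{\pi}{2}}}{\left(2 - \frac{\pi}{2}\right)^{\frac{3}{2}}} \approx 0.63111 \quad \text{and} \quad \gamma_2 = \frac{8 - \frac{3\pi^2}{4}}{\left(2 - \frac{\pi}{2}\right)^2} - 3 \approx 0.24509$$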

37.3 Cumulative Distribution 


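The cumulative distribution, or the distribution function, is given by

$$F(x) = \int_0^x f(y)\,dy = \frac{1}{\alpha^2} \int_0^x y\, e^{-\frac{y^2}{2\alpha^2}}\,dy = \int_0^{\frac{x^2}{2\alpha^2}} e^{-z}\,dz = 1 - e^{-\frac{x^2}{2\alpha^2}}$$

where we have made the substitution $z = \frac{y^2}{2\alpha^2}$ in order to simplify the integration. As it
should we see that $F(0) = 0$ and $F(\infty) = 1$.

Using this we may estimate the median $M$ by

$$F(M) = \frac{1}{2} \quad \Rightarrow \quad M = \alpha\sqrt{2\ln 2} \approx 1.17741\alpha$$

and the lower and upper quartiles become

$$Q_1 = \alpha\sqrt{-2\ln\tfrac{3}{4}} \approx 0.75853\alpha \quad \text{and} \quad Q_3 = \alpha\sqrt{2\ln 4} \approx 1.66511\alpha$$

and the same technique is useful when generating random numbers from the Rayleigh
distribution as is described below.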


37.4 Two-dimensional Kinetic Theory 


Given two independent coordinates $x$ and $y$ from normal distributions with zero mean and
the same variance $\sigma^2$ the distance $z = \sqrt{x^2 + y^2}$ is distributed according to the Rayleigh
distribution. The $x$ and $y$ may e.g. be regarded as the velocity components of a particle
moving in a plane.

To realize this we first write

$$w = \frac{z^2}{\sigma^2} = \frac{x^2}{\sigma^2} + \frac{y^2}{\sigma^2}$$

Since $x/\sigma$ and $y/\sigma$ are distributed as standard normal variables the sum of their squares
has the chi-squared distribution with 2 degrees of freedom i.e. $g(w) = e^{-w/2}/2$ from which
we find

$$f(z) = g(w)\left|\frac{dw}{dz}\right| = \frac{1}{2}\, e^{-\frac{z^2}{2\sigma^2}} \cdot \frac{2z}{\sigma^2} = \frac{z}{\sigma^2}\, e^{-\frac{z^2}{2\sigma^2}}$$

which we recognize as the Rayleigh distribution. This may be compared to the
three-dimensional case where we end up with the Maxwell distribution.

37.5 Random Number Generation

To obtain random numbers from the Rayleigh distribution in an efficient way we make the
transformation $y = x^2/2\alpha^2$, a variable which follows the exponential distribution $g(y) = e^{-y}$.
A random number from this distribution is easily obtained by taking minus the natural
logarithm of a uniform random number. We may thus find a random number $r$ from a
Rayleigh distribution by the expression

$$r = \alpha\sqrt{-2\ln\xi}$$

where $\xi$ is a random number uniformly distributed between zero and one.

This could have been found at once using the cumulative distribution putting

$$F(x) = \xi \quad \Rightarrow \quad 1 - e^{-\frac{x^2}{2\alpha^2}} = \xi \quad \Rightarrow \quad x = \alpha\sqrt{-2\ln(1 - \xi)}$$

a result which is identical since if $\xi$ is uniformly distributed between zero and one so is
$1 - \xi$.

Following the examples given above we may also have used two independent random
numbers from a standard normal distribution, $z_1$ and $z_2$, and construct

$$r = \alpha\sqrt{z_1^2 + z_2^2}$$

However, this technique is not as efficient as the one outlined above.
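
The first expression may be sketched in code as below. The function name is ours and $\xi$
is drawn from (0, 1] to avoid the logarithm of zero.

```python
import math
import random


def rayleigh_random(alpha, rng=random):
    """Rayleigh deviate r = alpha * sqrt(-2 ln xi)."""
    xi = 1.0 - rng.random()              # in (0, 1]
    return alpha * math.sqrt(-2.0 * math.log(xi))
```

The sample median and mean of numbers generated in this way may be compared with
$\alpha\sqrt{2\ln 2}$ and $\alpha\sqrt{\pi/2}$ from the sections above.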
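
38 Student’s t-distribution

38.1 Introduction

The Student’s t-distribution is given by

$$f(t; n) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}} = \frac{\left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}}{\sqrt{n}\,B\left(\frac{1}{2}, \frac{n}{2}\right)}$$

where the parameter $n$ is a positive integer and the variable $t$ is a real number. The
functions $\Gamma$ and $B$ are the usual Gamma and Beta functions. In figure 31 we show the
t-distribution for $n$ values of 1 (lowest maximum), 2, 5 and $\infty$ (fully drawn and identical to
the standard normal distribution).

Figure 31: Graph of t-distribution for some values of n

If we change variable to $x = t/\sqrt{n}$ and put $m = \frac{n+1}{2}$ the Student’s t-distribution
becomes

$$f(x; m) = \frac{k}{(1 + x^2)^m} \quad \text{with} \quad k = \frac{\Gamma(m)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(m - \frac{1}{2}\right)} = \frac{1}{B\left(\frac{1}{2}, m - \frac{1}{2}\right)}$$

where $k$ is simply a normalization constant and $m$ is a positive half-integer.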


38.2 History 

A brief history behind this distribution and its name is the following. William Sealy Gosset 
(1876-1937) had a degree in mathematics and chemistry from Oxford when he in 1899 began 
working for Messrs. Guinness brewery in Dublin. In his work at the brewery he developed 
a small-sample theory of statistics which he needed in making small-scale experiments. 
Due to company policy it was forbidden for employees to publish scientific papers and his
work on the t-ratio was published under the pseudonym “Student”. It is a very important
contribution to statistical theory.


38.3 Moments 


The Student’s t-distribution is symmetrical around $t = 0$ and thus all odd central moments
vanish. In calculating even moments (note that algebraic and central moments are equal)
we make use of the somewhat simpler $f(x; m)$ form given above with $x = \frac{t}{\sqrt{n}}$ which implies
the following relation between expectation values $E(t^{2r}) = n^r E(x^{2r})$. Central moments of
even order are given by, with $r$ an integer $\ge 0$,

$$\mu_{2r}(x) = \int_{-\infty}^{\infty} x^{2r} f(x; m)\,dx = k \int_{-\infty}^{\infty} \frac{x^{2r}}{(1 + x^2)^m}\,dx = 2k \int_0^{\infty} \frac{x^{2r}}{(1 + x^2)^m}\,dx$$

If we make the substitution $y = \frac{x^2}{1+x^2}$ implying $\frac{1}{1+x^2} = 1 - y$ and $x = \sqrt{\frac{y}{1-y}}$ then
$dy = \frac{2x}{(1+x^2)^2}\,dx$ and we obtain

$$\mu_{2r}(x) = 2k \int_0^1 \frac{x^{2r}}{(1 + x^2)^m} \cdot \frac{(1 + x^2)^2}{2x}\,dy = k \int_0^1 \frac{x^{2r-1}}{(1 + x^2)^{m-2}}\,dy =$$

$$= k \int_0^1 (1 - y)^{m-2} \left(\sqrt{\frac{y}{1 - y}}\right)^{2r-1} dy = k \int_0^1 (1 - y)^{m-r-\frac{3}{2}}\, y^{r-\frac{1}{2}}\,dy =$$

$$= k\,B\left(r + \tfrac{1}{2},\, m - r - \tfrac{1}{2}\right) = \frac{B\left(r + \frac{1}{2},\, m - r - \frac{1}{2}\right)}{B\left(\frac{1}{2},\, m - \frac{1}{2}\right)}$$

The normalization constant $k$ was given above and we may now verify this expression by
looking at $\mu_0 = 1$ giving $k^{-1} = B(\frac{1}{2}, m - \frac{1}{2})$ and thus finally, including the $n^r$ factor giving
moments in $t$ we have

$$\mu_{2r}(t) = n^r \mu_{2r}(x) = n^r\, \frac{B\left(r + \frac{1}{2},\, m - r - \frac{1}{2}\right)}{B\left(\frac{1}{2},\, m - \frac{1}{2}\right)} = n^r\, \frac{B\left(r + \frac{1}{2},\, \frac{n}{2} - r\right)}{B\left(\frac{1}{2},\, \frac{n}{2}\right)}$$

As can be seen from this expression we get into problems for $2r \ge n$ and indeed those
moments are undefined or divergent⁹. The formula is thus valid only for $2r < n$. A recursive
formula to obtain even algebraic moments of the t-distribution is

$$\mu'_{2r} = \mu'_{2r-2} \cdot n \cdot \frac{r - \frac{1}{2}}{\frac{n}{2} - r}$$

starting with $\mu'_0 = 1$.

Especially we note that, when $n$ is big enough so that these moments are defined,
the second central moment (i.e. the variance) is $\mu_2 = V(t) = \frac{n}{n-2}$ and the fourth central
moment is given by $\mu_4 = \frac{3n^2}{(n-2)(n-4)}$. The coefficients of skewness and kurtosis are given by
$\gamma_1 = 0$ and $\gamma_2 = \frac{6}{n-4}$, respectively.
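
As a small numerical illustration the recursive formula may be compared with the
expression in Beta functions. The function names below are ours and the Beta function is
written in terms of `math.gamma` from the Python standard library.

```python
import math


def t_even_moments(n, r_max):
    """Even moments mu'_0, mu'_2, ..., mu'_{2 r_max} of the t-distribution."""
    moments = [1.0]                                   # mu'_0 = 1
    for r in range(1, r_max + 1):
        if 2 * r >= n:
            raise ValueError("moment of order 2r requires 2r < n")
        moments.append(moments[-1] * n * (r - 0.5) / (0.5 * n - r))
    return moments


def beta(a, b):
    """Beta function B(a, b) expressed with Gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def t_even_moment_closed(n, r):
    """mu_{2r}(t) = n^r B(r + 1/2, n/2 - r) / B(1/2, n/2), valid for 2r < n."""
    return n ** r * beta(r + 0.5, 0.5 * n - r) / beta(0.5, 0.5 * n)
```

For $n = 10$ both routes give $\mu_2 = 1.25$ and $\mu_4 = 6.25$, in agreement with $\frac{n}{n-2}$ and
$\frac{3n^2}{(n-2)(n-4)}$.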


⁹ See e.g. the discussion in the description of the moments for the Cauchy distribution which is the
special case where $m = n = 1$.
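
38.4 Cumulative Function

In calculating the cumulative function for the t-distribution it turns out to be simplifying
to first estimate the integral for a symmetric region

$$\int_{-t}^{t} f(u)\,du = \frac{2}{\sqrt{n}\,B\left(\frac{1}{2}, \frac{n}{2}\right)} \int_0^t \left(1 + \frac{u^2}{n}\right)^{-\frac{n+1}{2}} du = \frac{1}{B\left(\frac{1}{2}, \frac{n}{2}\right)} \int_{\frac{n}{n+t^2}}^{1} (1 - x)^{-\frac{1}{2}}\, x^{\frac{n}{2}-1}\,dx = 1 - I_{\frac{n}{n+t^2}}\left(\frac{n}{2}, \frac{1}{2}\right)$$

where we have made the substitution $x = n/(n + u^2)$ in order to simplify the integration
and $I_x(a, b)$ denotes the incomplete Beta function.
From this we find the cumulative function as

$$F(t) = \begin{cases} \frac{1}{2}\, I_{\frac{n}{n+t^2}}\left(\frac{n}{2}, \frac{1}{2}\right) & \text{for } -\infty < t < 0 \\ 1 - \frac{1}{2}\, I_{\frac{n}{n+t^2}}\left(\frac{n}{2}, \frac{1}{2}\right) & \text{for } 0 \le t < \infty \end{cases}$$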


38.5 Relations to Other Distributions 

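The distribution in $F = t^2$ is given by

$$f(F) = 2\left|\frac{dt}{dF}\right| f(t) = 2 \cdot \frac{1}{2\sqrt{F}} \cdot \frac{\left(1 + \frac{F}{n}\right)^{-\frac{n+1}{2}}}{\sqrt{n}\,B\left(\frac{1}{2}, \frac{n}{2}\right)} = \frac{n^{\frac{n}{2}}\, F^{-\frac{1}{2}}}{B\left(\frac{1}{2}, \frac{n}{2}\right)(F + n)^{\frac{n+1}{2}}}$$

where the factor 2 accounts for the two branches $t = \pm\sqrt{F}$ and
which we recognize as an F-distribution with 1 and $n$ degrees of freedom.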

As $n \to \infty$ the Student’s t-distribution approaches the standard normal distribution.
However, a better approximation than to create a simple-minded standardized variable,
dividing by the square root of the variance, is to use



which is more closely distributed according to the standard normal distribution. 
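
38.6 t-ratio

Regard $t = x/\sqrt{y/n}$ where $x$ and $y$ are independent variables distributed according to the
standard normal and the chi-square distribution with $n$ degrees of freedom, respectively.
The independence implies that the joint probability function in $x$ and $y$ is given by

$$f(x, y; n) = \left(\frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\right) \left(\frac{\left(\frac{y}{2}\right)^{\frac{n}{2}-1} e^{-\frac{y}{2}}}{2\Gamma\left(\frac{n}{2}\right)}\right)$$

where $-\infty < x < \infty$ and $y > 0$. If we change variables to $t = x/\sqrt{y/n}$ and $u = y$ the
distribution in $t$ and $u$, with $-\infty < t < \infty$ and $u > 0$, becomes

$$f(t, u; n) = \left|\frac{\partial(x, y)}{\partial(t, u)}\right| f(x, y; n)$$

The determinant is $\sqrt{u/n}$ and thus we have

$$f(t, u; n) = \sqrt{\frac{u}{n}} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-\frac{ut^2}{2n}} \cdot \frac{\left(\frac{u}{2}\right)^{\frac{n}{2}-1} e^{-\frac{u}{2}}}{2\Gamma\left(\frac{n}{2}\right)} = \frac{u^{\frac{1}{2}(n+1)-1}\, e^{-\frac{u}{2}\left(1 + \frac{t^2}{n}\right)}}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right) 2^{\frac{n+1}{2}}}$$

Finally, since we are interested in the marginal distribution in $t$ we integrate over $u$

$$f(t; n) = \int_0^\infty f(t, u; n)\,du = \frac{1}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right) 2^{\frac{n+1}{2}}} \int_0^\infty u^{\frac{n+1}{2}-1}\, e^{-\frac{u}{2}\left(1 + \frac{t^2}{n}\right)}\,du =$$

$$= \frac{\left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \int_0^\infty v^{\frac{n+1}{2}-1} e^{-v}\,dv = \frac{\Gamma\left(\frac{n+1}{2}\right)\left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} = \frac{\left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}}{\sqrt{n}\,B\left(\frac{1}{2}, \frac{n}{2}\right)}$$

where we made the substitution $v = \frac{u}{2}\left(1 + \frac{t^2}{n}\right)$ in order to simplify the integral which in
the last step is recognized as being equal to $\Gamma\left(\frac{n+1}{2}\right)$.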


38.7 One Normal Sample 

Regard a sample from a normal population $N(\mu,\sigma^2)$ where the mean value $\bar{x}$ is distributed
as $N(\mu,\frac{\sigma^2}{n})$ and $\frac{(n-1)s^2}{\sigma^2}$ is distributed according to the chi-square distribution with $n-1$
degrees of freedom. Here $s^2$ is the usual unbiased variance estimator $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$
which in the case of a normal distribution is independent of $\bar{x}$. This implies that

$$t = \frac{\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\frac{(n-1)s^2}{\sigma^2}\Big/(n-1)}} = \frac{\bar{x}-\mu}{s/\sqrt{n}}$$

is distributed according to Student’s t-distribution with $n-1$ degrees of freedom. We may
thus use Student’s t-distribution to test the hypothesis that $\bar{x} = \mu$ (see below).

38.8 Two Normal Samples


Regard two samples $\{x_1, x_2, \ldots, x_m\}$ and $\{y_1, y_2, \ldots, y_n\}$ from normal distributions having
the same variance $\sigma^2$ but possibly different means $\mu_x$ and $\mu_y$, respectively. Then the
quantity $(\bar{x}-\bar{y}) - (\mu_x-\mu_y)$ has a normal distribution with zero mean and variance equal
to $\sigma^2\left(\frac1m+\frac1n\right)$. Furthermore the pooled variance estimate

$$s^2 = \frac{(m-1)s_x^2 + (n-1)s_y^2}{m+n-2} = \frac{\sum_{i=1}^{m}(x_i-\bar{x})^2 + \sum_{i=1}^{n}(y_i-\bar{y})^2}{m+n-2}$$

is a normal theory estimate of $\sigma^2$ with $m+n-2$ degrees of freedom (see footnote 10).
Since $s^2$ is independent of $\bar{x}$ for normal populations the variable

$$t = \frac{(\bar{x}-\bar{y}) - (\mu_x-\mu_y)}{s\sqrt{\frac1m+\frac1n}}$$

has the t-distribution with $m+n-2$ degrees of freedom. We may thus use Student’s
t-distribution to test the hypothesis that $\bar{x}-\bar{y}$ is consistent with $\delta = \mu_x-\mu_y$. In particular
we may test if $\delta = 0$ i.e. if the two samples originate from populations having the same
means as well as variances.


38.9 Paired Data 

If observations are made in pairs $(x_i, y_i)$ for $i = 1, 2, \ldots, n$ the appropriate test statistic is

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n}} = \frac{\bar{d}}{\sqrt{\frac{\sum_{i=1}^{n}(d_i-\bar{d})^2}{n(n-1)}}}$$

where $d_i = x_i - y_i$ and $\bar{d} = \bar{x}-\bar{y}$. This quantity has a t-distribution with $n-1$ degrees of
freedom. We may also write this t-ratio as

$$t = \frac{\sqrt{n}\cdot\bar{d}}{\sqrt{s_x^2+s_y^2-2C_{xy}}}$$

where $s_x^2$ and $s_y^2$ are the estimated variances of x and y and $C_{xy}$ is the covariance between
them. If we would not pair the data the covariance term would be zero but the number of
degrees of freedom $2n-2$ i.e. twice as large. The smaller number of degrees of freedom in
the paired case is, however, often compensated for by the inclusion of the covariance.
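
The three t-ratios for one sample, two samples and paired data can be evaluated in a few lines. The following is a minimal sketch in Python, not part of the original text; the function names are our own, and `statistics.variance` is assumed to supply the unbiased estimator $s^2$ used above.

```python
import math
from statistics import mean, variance  # variance() is the unbiased estimator s^2


def t_one_sample(x, mu0):
    """t-ratio and degrees of freedom for H0: mu = mu0 (one normal sample)."""
    n = len(x)
    return (mean(x) - mu0) / math.sqrt(variance(x) / n), n - 1


def t_two_sample(x, y, delta=0.0):
    """t-ratio using the pooled variance estimate (equal variances assumed)."""
    m, n = len(x), len(y)
    s2 = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    t = (mean(x) - mean(y) - delta) / math.sqrt(s2 * (1.0 / m + 1.0 / n))
    return t, m + n - 2


def t_paired(x, y):
    """t-ratio for paired observations (x_i, y_i)."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    return mean(d) / math.sqrt(variance(d) / n), n - 1
```

Each function returns the t-ratio together with the number of degrees of freedom to be used when looking up or calculating the probability content.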

38.10 Confidence Levels 

In determining confidence levels or testing hypotheses using the t-distribution we define
the quantity $t_{\alpha,n}$ from

$$F(t_{\alpha,n}) = \int_{-\infty}^{t_{\alpha,n}} f(t;n)\,dt = 1-\alpha$$

i.e. $\alpha$ is the probability that a variable distributed according to the t-distribution with
n degrees of freedom exceeds $t_{\alpha,n}$. Note that due to the symmetry about zero of the
t-distribution $t_{\alpha,n} = -t_{1-\alpha,n}$.

10 If y is a normal theory estimate of $\sigma^2$ with k degrees of freedom then $ky/\sigma^2$ is distributed according
to the chi-square distribution with k degrees of freedom.

In the case of one normal sample described above we may set a $1-\alpha$ confidence interval
for $\mu$

$$\bar{x} - \frac{s}{\sqrt{n}}\,t_{\alpha/2,n-1} \le \mu \le \bar{x} + \frac{s}{\sqrt{n}}\,t_{\alpha/2,n-1}$$

Note that in the case where $\sigma^2$ is known we would not use the t-distribution. The
appropriate distribution to use in order to set confidence levels in this case would be the
normal distribution.

38.11 Testing Hypotheses 

As indicated above we may use the t-statistics in order to test hypotheses regarding the
means of populations from normal distributions.

In the case of one sample the null hypothesis would be $H_0: \mu = \mu_0$ and the alternative
hypothesis $H_1: \mu \ne \mu_0$. We would then use $t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}$ as outlined above and reject $H_0$ at
the significance level $\alpha$ if $|t| > t_{\alpha/2,n-1}$. This test is two-tailed since we
do not assume any a priori knowledge of the direction in which a possible difference would
be. If the alternative hypothesis would be e.g. $H_1: \mu > \mu_0$ then a one-tailed test would be
appropriate.

The probability to reject the hypothesis $H_0$ if it is indeed true is thus $\alpha$. This is a so
called Type I error. However, we might also be interested in the probability of committing
a Type II error implying that we would accept the hypothesis although it was wrong and
the distribution instead had a mean $\mu_1$. In addressing this question the t-distribution
could be modified yielding the non-central t-distribution. The probability content $\beta$ of
this distribution in the confidence interval used would then be the probability of wrongly
accepting the hypothesis. This calculation would depend on the choice of $\alpha$ as well as on
the number of observations n. However, we do not describe details about this here.

In the two sample case we may want to test the null hypothesis $H_0: \mu_x = \mu_y$ as
compared to $H_1: \mu_x \ne \mu_y$. Once again we would reject $H_0$ if the absolute value of the
quantity $t = (\bar{x}-\bar{y})\big/\left(s\sqrt{\frac1m+\frac1n}\right)$ would exceed $t_{\alpha/2,n+m-2}$.

38.12 Calculation of Probability Content 

In order to find confidence intervals or to test hypotheses we must be able to calculate
integrals of the probability density function over certain regions. We recall the formula

$$F(t_{\alpha,n}) = \int_{-\infty}^{t_{\alpha,n}} f(t;n)\,dt = 1-\alpha$$

which defines the quantity $t_{\alpha,n}$ for a specified confidence level $\alpha$. The probability to get a
value equal to $t_{\alpha,n}$ or higher is thus $\alpha$.

Classically all text-books in statistics are equipped with tables giving values of $t_{\alpha,n}$ for
specific $\alpha$-values. This is sometimes useful and in table 8 on page 178 we show such a
table giving points where the distribution has a cumulative probability content of $1-\alpha$ for
different number of degrees of freedom.

However, it is often preferable to calculate directly the exact probability that one would
observe the actual t-value or worse. To calculate the integral on the left-hand side we distinguish
between the cases where the number of degrees of freedom is an odd or an even integer. The
equation above may either be adjusted such that a required $\alpha$ is obtained or we may
replace $t_{\alpha,n}$ with the actual t-value found in order to calculate the probability for the
present outcome of the experiment.

The algorithm proposed for calculating the probability content of the t-distribution is
described in the following subsections.


38.12.1 Even number of degrees of freedom 

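For even n we have, putting $m = \frac n2$ and making the substitution $x = \frac{t}{\sqrt{n}}$,

$$1-\alpha = \int_{-\infty}^{t_{\alpha,n}} f(t;n)\,dt = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac n2\right)} \int_{-\infty}^{t_{\alpha,n}} \frac{dt}{\left(1+\frac{t^2}{n}\right)^{\frac{n+1}{2}}} = \frac{\Gamma\left(m+\frac12\right)}{\sqrt{\pi}\,\Gamma(m)} \int_{-\infty}^{t_{\alpha,n}/\sqrt{n}} \frac{dx}{(1+x^2)^{m+\frac12}}$$

For convenience (or maybe it is rather laziness) we make use of standard integral tables
where we find the integral

$$\int \frac{dx}{(ax^2+c)^{m+\frac12}} = \frac{x}{\sqrt{ax^2+c}} \sum_{r=0}^{m-1} \frac{2^{2m-2r-1}\,(m-1)!\,m!\,(2r)!}{(2m)!\,(r!)^2\,c^{m-r}\,(ax^2+c)^r}$$

where in our case $a = c = 1$. Introducing $x_\alpha = t_{\alpha,n}/\sqrt{n}$ for convenience this gives

$$1-\alpha = \frac{\Gamma\left(m+\frac12\right)(m-1)!\,m!\,2^{2m}}{\sqrt{\pi}\,\Gamma(m)\,(2m)!} \left[\frac{x_\alpha}{2\sqrt{1+x_\alpha^2}} \sum_{r=0}^{m-1} \frac{(2r)!}{2^{2r}(r!)^2(1+x_\alpha^2)^r} + \frac12\right]$$

The last term inside the brackets stems from the lower limit $-\infty$, where the primitive function is seen to equal
$-\frac12$. Looking at the factor outside the brackets, using that $\Gamma(n) = (n-1)!$ for n a positive
integer, $\Gamma\left(m+\frac12\right) = \frac{(2m-1)!!}{2^m}\sqrt{\pi}$, and rewriting $(2m)! = (2m)!!\,(2m-1)!! = 2^m m!\,(2m-1)!!$
we find that it in fact is equal to one. We thus have

$$1-\alpha = \frac{x_\alpha}{2\sqrt{1+x_\alpha^2}} \sum_{r=0}^{m-1} \frac{(2r)!}{2^{2r}(r!)^2(1+x_\alpha^2)^r} + \frac12$$

In evaluating the sum it is useful to look at the individual terms. Denoting these by $u_r$ we
find the recurrence relation

$$u_r = u_{r-1}\cdot\frac{2r(2r-1)}{r^2\cdot 2^2\,(1+x_\alpha^2)} = u_{r-1}\cdot\frac{1-\frac{1}{2r}}{1+x_\alpha^2}$$

where we start with $u_0 = 1$.

To summarize: in order to determine the probability $\alpha$ to observe a value t or bigger
from a t-distribution with an even number of degrees of freedom n we calculate

$$1-\alpha = \frac{t/\sqrt{n}}{2\sqrt{1+\frac{t^2}{n}}} \sum_{r=0}^{m-1} u_r + \frac12$$

where $u_0 = 1$ and $u_r = u_{r-1}\cdot\frac{1-\frac{1}{2r}}{1+t^2/n}$.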

38.12.2 Odd number of degrees of freedom 

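For odd n we have, putting $m = \frac{n-1}{2}$ and making the substitution $x = \frac{t}{\sqrt{n}}$,

$$1-\alpha = \int_{-\infty}^{t_{\alpha,n}} f(t;n)\,dt = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac n2\right)} \int_{-\infty}^{t_{\alpha,n}} \frac{dt}{\left(1+\frac{t^2}{n}\right)^{\frac{n+1}{2}}} = \frac{\Gamma(m+1)}{\sqrt{\pi}\,\Gamma\left(m+\frac12\right)} \int_{-\infty}^{x_\alpha} \frac{dx}{(1+x^2)^{m+1}}$$

where we again have introduced $x_\alpha = t_{\alpha,n}/\sqrt{n}$. Once again we make use of standard
integral tables where we find the integral

$$\int \frac{dx}{(a+bx^2)^{m+1}} = \frac{(2m)!}{(m!)^2} \left[\frac{x}{2a} \sum_{r=1}^{m} \frac{r!\,(r-1)!}{(4a)^{m-r}\,(2r)!\,(a+bx^2)^r} + \frac{1}{(4a)^m} \int \frac{dx}{a+bx^2}\right]$$

where in our case $a = b = 1$. We obtain

$$1-\alpha = \frac{\Gamma(m+1)\,(2m)!}{\sqrt{\pi}\,\Gamma\left(m+\frac12\right)(m!)^2\,4^m} \left[\frac{x_\alpha}{2} \sum_{r=1}^{m} \frac{4^r\,r!\,(r-1)!}{(2r)!\,(1+x_\alpha^2)^r} + \arctan x_\alpha + \frac\pi2\right]$$

where the last term inside the brackets stems from the lower limit $-\infty$ of the primitive function. The factor
outside the brackets is equal to $\frac1\pi$ which is found using that $\Gamma(n) = (n-1)!$ for n a positive
integer, $\Gamma\left(m+\frac12\right) = \frac{(2m-1)!!}{2^m}\sqrt{\pi}$, and $(2m)! = (2m)!!\,(2m-1)!! = 2^m m!\,(2m-1)!!$. We
get

$$1-\alpha = \frac1\pi \left[\frac{x_\alpha}{2} \sum_{r=1}^{m} \frac{4^r\,r!\,(r-1)!}{(2r)!\,(1+x_\alpha^2)^r} + \arctan x_\alpha + \frac\pi2\right] = \frac1\pi \left[\frac{x_\alpha}{1+x_\alpha^2} \sum_{r=1}^{m} \frac{2^{2r-1}\,r!\,(r-1)!}{(2r)!\,(1+x_\alpha^2)^{r-1}} + \arctan x_\alpha + \frac\pi2\right]$$

To compute the sum we denote the terms by $v_r$ and find the recurrence relation

$$v_r = v_{r-1}\cdot\frac{4r(r-1)}{2r(2r-1)(1+x_\alpha^2)} = v_{r-1}\cdot\frac{1-\frac{1}{2r-1}}{1+x_\alpha^2}$$

starting with $v_1 = 1$.

To summarize: in order to determine the probability $\alpha$ to observe a value t or bigger
from a t-distribution with an odd number of degrees of freedom n we calculate

$$1-\alpha = \frac1\pi \left[\frac{t/\sqrt{n}}{1+\frac{t^2}{n}} \sum_{r=1}^{\frac{n-1}{2}} v_r + \arctan\frac{t}{\sqrt{n}}\right] + \frac12$$

where $v_1 = 1$ and $v_r = v_{r-1}\cdot\frac{1-\frac{1}{2r-1}}{1+t^2/n}$.

38.12.3 Final algorithm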


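The final algorithm to evaluate the tail probability $\alpha$ of observing a value t or bigger
(the probability content from $-\infty$ to t is then $1-\alpha$) for a t-distribution
with n degrees of freedom is

• Calculate $x = \frac{t}{\sqrt{n}}$

• For n even:

o Put $m = \frac n2$.
o Set $u_0 = 1$, $s = 0$ and $i = 0$.
o For $i = 0, 1, 2, \ldots, m-1$ set $s = s + u_i$, $i = i + 1$ and $u_i = u_{i-1}\cdot\frac{1-\frac{1}{2i}}{1+x^2}$.
o $\alpha = \frac12 - \frac{x}{2\sqrt{1+x^2}}\cdot s$.

• For n odd:

o Put $m = \frac{n-1}{2}$.
o Set $v_1 = 1$, $s = 0$ and $i = 1$.
o For $i = 1, 2, \ldots, m$ set $s = s + v_i$, $i = i + 1$ and $v_i = v_{i-1}\cdot\frac{1-\frac{1}{2i-1}}{1+x^2}$.
o $\alpha = \frac12 - \frac1\pi\left(\frac{x}{1+x^2}\cdot s + \arctan x\right)$.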
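
The algorithm above translates almost line by line into code. The following Python sketch is an illustration under our own naming (`t_upper_tail` is not a name used in the text); it returns $\alpha$, so the probability content from $-\infty$ to t is one minus the returned value.

```python
import math


def t_upper_tail(t, n):
    """alpha = P(T >= t) for Student's t with n (positive integer) degrees of freedom."""
    x = t / math.sqrt(n)
    q = 1.0 + x * x
    if n % 2 == 0:
        m = n // 2
        u, s = 1.0, 0.0                       # u_0 = 1
        for r in range(m):                    # r = 0, 1, ..., m-1
            if r > 0:
                u *= (1.0 - 1.0 / (2 * r)) / q
            s += u
        return 0.5 - x / (2.0 * math.sqrt(q)) * s
    m = (n - 1) // 2
    v, s = 1.0, 0.0                           # v_1 = 1
    for r in range(1, m + 1):                 # r = 1, 2, ..., m
        if r > 1:
            v *= (1.0 - 1.0 / (2 * r - 1)) / q
        s += v
    return 0.5 - (x / q * s + math.atan(x)) / math.pi
```

For n = 1 the sum is empty and the expression reduces to the Cauchy case, e.g. `t_upper_tail(1.0, 1)` evaluates $\frac12 - \frac{1}{\pi}\arctan 1$.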


38.13 Random Number Generation 

Following the definition we may define a random number t from a t-distribution, using
random numbers from a normal and a chi-square distribution, as

$$t = \frac{z}{\sqrt{y_n/n}}$$

where z is a standard normal and $y_n$ a chi-squared variable with n degrees of freedom. To
obtain random numbers from these distributions see the appropriate sections.

39 Triangular Distribution

39.1 Introduction 

The triangular distribution is given by

$$f(x;\mu,\Gamma) = \frac{-|x-\mu|}{\Gamma^2} + \frac{1}{\Gamma}$$

where the variable x is bounded to the interval $\mu-\Gamma \le x \le \mu+\Gamma$ and the location and
scale parameters $\mu$ and $\Gamma$ ($\Gamma > 0$) all are real numbers.

39.2 Moments 

The expectation value of the distribution is $E(x) = \mu$. Due to the symmetry of the
distribution odd central moments vanish while even moments are given by

$$\mu_n = \frac{2\,\Gamma^n}{(n+1)(n+2)}$$

for even values of n. In particular the variance $V(x) = \mu_2 = \Gamma^2/6$ and the fourth central
moment $\mu_4 = \Gamma^4/15$. The coefficient of skewness is zero and the coefficient of kurtosis
$\gamma_2 = -0.6$.

39.3 Random Number Generation 

The sum of two pseudorandom numbers uniformly distributed between $(\mu-\Gamma)/2$ and
$(\mu+\Gamma)/2$ is distributed according to the triangular distribution. If $\xi_1$ and $\xi_2$ are uniformly
distributed between zero and one then

$$x = \mu + (\xi_1+\xi_2-1)\,\Gamma \quad\text{or}\quad x = \mu + (\xi_1-\xi_2)\,\Gamma$$

follow the triangular distribution.
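
As a small illustration of the first of these recipes, the following Python sketch draws triangular random numbers. The function name and the optional `rng` argument (any object with a `random()` method, e.g. a `random.Random` instance) are our own choices and not part of the original text.

```python
import random


def triangular(mu, gamma, rng=random):
    """One random number from the triangular distribution with location mu
    and half-width gamma, as the shifted sum of two uniform numbers."""
    return mu + (rng.random() + rng.random() - 1.0) * gamma
```

A quick sanity check is that a large sample should have mean close to $\mu$ and variance close to $\Gamma^2/6$.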

Note that this is a special case of a combination

$$x = (a+b)\,\xi_1 + (b-a)\,\xi_2 - b$$

with $b > a \ge 0$ which gives a random number from a symmetric trapezoidal distribution
with vertices at $(\pm b, 0)$ and $(\pm a, \frac{1}{a+b})$.

40 Uniform Distribution


40.1 Introduction 

The uniform distribution is, of course, a very simple case with

$$f(x;a,b) = \frac{1}{b-a} \quad\text{for } a \le x \le b$$

The cumulative distribution function is thus given by

$$F(x;a,b) = \begin{cases} 0 & \text{if } x < a \\ \frac{x-a}{b-a} & \text{if } a \le x \le b \\ 1 & \text{if } b < x \end{cases}$$


40.2 Moments 

The uniform distribution has expectation value $E(x) = (a+b)/2$, variance $V(x) = (b-a)^2/12$,
$\mu_3 = 0$, $\mu_4 = (b-a)^4/80$, coefficient of skewness $\gamma_1 = 0$ and coefficient of kurtosis
$\gamma_2 = -1.2$. More generally all odd central moments vanish and for n an even integer

$$\mu_n = \frac{(b-a)^n}{2^n(n+1)}$$

40.3 Random Number Generation 

Since we assume the presence of a pseudorandom number generator giving random numbers
$\xi$ between zero and one a random number from the uniform distribution is simply given by

$$x = (b-a)\,\xi + a$$

41 Weibull Distribution


41.1 Introduction 


The Weibull distribution is given by

$$f(x;\eta,\sigma) = \frac{\eta}{\sigma}\left(\frac{x}{\sigma}\right)^{\eta-1} e^{-\left(\frac{x}{\sigma}\right)^{\eta}}$$

where the variable x and the parameters $\eta$ and $\sigma$ all are positive real numbers. The
distribution is named after the Swedish physicist Waloddi Weibull (1887-1979) a professor
at the Technical Highschool in Stockholm 1924-1953.

The parameter $\sigma$ is simply a scale parameter and the variable $y = x/\sigma$ has the distribution

$$g(y) = \eta\,y^{\eta-1}\,e^{-y^{\eta}}$$

In figure 32 we show the distribution for a few values of $\eta$. For $\eta < 1$ the distribution has
its mode at $y = 0$, at $\eta = 1$ it is identical to the exponential distribution, and for $\eta > 1$
the distribution has a mode at

$$y = \left(\frac{\eta-1}{\eta}\right)^{\frac1\eta}$$

which approaches $y = 1$ as $\eta$ increases (at the same time the distribution gets more symmetric
and narrow).

Figure 32: The Weibull distribution

41.2 Cumulative Distribution 

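The cumulative distribution is given by

$$F(x) = \int_0^x f(u)\,du = \int_0^x \frac{\eta}{\sigma}\left(\frac{u}{\sigma}\right)^{\eta-1} e^{-\left(\frac{u}{\sigma}\right)^{\eta}} du = \int_0^{(x/\sigma)^{\eta}} e^{-y}\,dy = 1 - e^{-\left(\frac{x}{\sigma}\right)^{\eta}}$$

where we have made the substitution $y = (u/\sigma)^{\eta}$ in order to simplify the integration.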

41.3 Moments 

Algebraic moments are given by

$$E(x^k) = \int_0^\infty x^k f(x)\,dx = \sigma^k \int_0^\infty y^{\frac k\eta}\,e^{-y}\,dy = \sigma^k\,\Gamma\left(\frac k\eta+1\right)$$

where we have made the same substitution as was used when evaluating the cumulative
distribution above.

Especially the expectation value and the variance are given by

$$E(x) = \sigma\,\Gamma\left(\frac1\eta+1\right) \quad\text{and}\quad V(x) = \sigma^2\left\{\Gamma\left(\frac2\eta+1\right) - \Gamma\left(\frac1\eta+1\right)^2\right\}$$
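
The moment formulae are easily evaluated with a Gamma function routine. The sketch below, in Python, is an illustration only: the function names are our own, `math.gamma` is used for $\Gamma$, and as a cross-check a sample is drawn by inverting the cumulative distribution $F(x) = 1 - e^{-(x/\sigma)^\eta}$.

```python
import math
import random


def weibull_mean_var(eta, sigma):
    """Expectation value and variance of the Weibull distribution."""
    g1 = math.gamma(1.0 / eta + 1.0)
    g2 = math.gamma(2.0 / eta + 1.0)
    return sigma * g1, sigma ** 2 * (g2 - g1 ** 2)


def weibull_sample(eta, sigma, rng=random):
    """One random number obtained by solving F(x) = xi for x."""
    xi = rng.random()                          # uniform on [0, 1)
    return sigma * (-math.log1p(-xi)) ** (1.0 / eta)
```

For $\eta = 1$ the function `weibull_mean_var` reproduces the exponential distribution values $E(x) = \sigma$ and $V(x) = \sigma^2$.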

41.4 Random Number Generation 

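To obtain random numbers from Weibull’s distribution using $\xi$, a random number uniformly
distributed from zero to one, we may solve the equation $F(x) = \xi$ to obtain a random
number in x.

$$F(x) = 1 - e^{-\left(\frac{x}{\sigma}\right)^{\eta}} = \xi \quad\Rightarrow\quad x = \sigma\left(-\ln(1-\xi)\right)^{\frac1\eta}$$

Since $1-\xi$ is itself uniformly distributed between zero and one we may equally well use
$x = \sigma(-\ln\xi)^{\frac1\eta}$.

42 Appendix A: The Gamma and Beta Functions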


42.1 Introduction 

In statistical calculations for standard statistical distributions such as the normal (or Gaussian)
distribution, the Student’s t-distribution, the chi-squared distribution, and the F-distribution
one often encounters the so called Gamma and Beta functions. More specifically
in calculating the probability content for these distributions the incomplete Gamma
and Beta functions occur. In the following we briefly define these functions and give numerical
methods on how to calculate them. Also connections to the different statistical
distributions are given. The main references for this have been [41,42,43] for the formalism
and [10] for the numerical methods.


42.2 The Gamma Function 


The Gamma function is normally defined as

$$\Gamma(z) = \int_0^\infty t^{z-1}\,e^{-t}\,dt$$

where z is a complex variable with $\mathrm{Re}(z) > 0$. This is the so called Euler’s integral form for
the Gamma function. There are, however, two other definitions worth mentioning. Firstly
Euler’s infinite limit form

$$\Gamma(z) = \lim_{n\to\infty} \frac{1\cdot2\cdot3\cdots n}{z(z+1)(z+2)\cdots(z+n)}\,n^z \qquad z \ne 0, -1, -2, \ldots$$

and secondly the infinite product form sometimes attributed to Euler and sometimes to
Weierstrass

$$\frac{1}{\Gamma(z)} = z\,e^{\gamma z} \prod_{n=1}^{\infty}\left(1+\frac zn\right)e^{-\frac zn} \qquad |z| < \infty$$

where $\gamma \approx 0.5772156649$ is Euler’s constant.

In figure 33 we show the Gamma function for real arguments from $-5$ to 5. Note the
singularities at $x = 0, -1, -2, \ldots$.

Figure 33: The Gamma function

For z a positive real integer n we have the well known relation to the factorial function

$$n! = \Gamma(n+1)$$

and, as the factorial function, the Gamma function satisfies the recurrence relation

$$\Gamma(z+1) = z\,\Gamma(z)$$

In the complex plane $\Gamma(z)$ has a pole at $z = 0$ and at all negative integer values of z.
The reflection formula

$$\Gamma(1-z) = \frac{\pi}{\Gamma(z)\sin(\pi z)} = \frac{\pi z}{\Gamma(z+1)\sin(\pi z)}$$

may be used in order to get function values for $\mathrm{Re}(z) < 1$ from values for $\mathrm{Re}(z) > 1$.


A well known approximation to the Gamma function is Stirling’s formula

$$\Gamma(z) = z^{z-\frac12}\,e^{-z}\,\sqrt{2\pi}\left(1 + \frac{1}{12z} + \frac{1}{288z^2} - \frac{139}{51840z^3} - \frac{571}{2488320z^4} + \ldots\right)$$

for $|\arg z| < \pi$ and $|z| \to \infty$ and where often only the first term (1) in the series expansion
is kept in approximate calculations. For the factorial of a positive integer n one often uses
the approximation

$$n! \approx \sqrt{2\pi n}\;n^n\,e^{-n}$$

which has the same origin and also is called Stirling’s formula.
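
To get a feeling for the quality of these approximations one may compare them with a library Gamma function. The following Python sketch, restricted to real positive arguments, is our own illustration; the names `stirling_gamma` and `stirling_factorial` are not taken from the text.

```python
import math

# Coefficients of 1, 1/z, 1/z^2, 1/z^3, 1/z^4 in Stirling's series
STIRLING_COEFFS = (1.0, 1.0 / 12.0, 1.0 / 288.0, -139.0 / 51840.0, -571.0 / 2488320.0)


def stirling_gamma(z, terms=5):
    """Stirling's formula for Gamma(z), real z > 0, keeping `terms` series terms."""
    series = sum(c / z ** k for k, c in enumerate(STIRLING_COEFFS[:terms]))
    return z ** (z - 0.5) * math.exp(-z) * math.sqrt(2.0 * math.pi) * series


def stirling_factorial(n):
    """The simple approximation n! ~ sqrt(2 pi n) n^n e^(-n)."""
    return math.sqrt(2.0 * math.pi * n) * n ** n * math.exp(-n)
```

Already at z = 10 the leading term alone is accurate to better than one percent, while the five-term series agrees with `math.gamma` to many more digits.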


42.2.1 Numerical Calculation 


There are several methods to calculate the Gamma function in terms of series expansions
etc. For numerical calculations, however, the formula by Lanczos is very useful [10]

$$\Gamma(z+1) = \left(z+\gamma+\tfrac12\right)^{z+\frac12} e^{-\left(z+\gamma+\frac12\right)}\,\sqrt{2\pi}\left[c_0 + \frac{c_1}{z+1} + \frac{c_2}{z+2} + \cdots + \frac{c_n}{z+n} + \epsilon\right]$$

for $z > 0$ and an optimal choice of the parameters $\gamma$, n, and $c_0$ to $c_n$ (here $\gamma$ is a free parameter
of the formula, not Euler’s constant). For $\gamma = 5$,
$n = 6$ and a certain set of c’s the error is smaller than $|\epsilon| < 2\cdot10^{-10}$. This bound is
true for all complex z in the half complex plane $\mathrm{Re}(z) > 0$. The coefficients normally
used are $c_0 = 1$, $c_1 = 76.18009173$, $c_2 = -86.50532033$, $c_3 = 24.01409822$, $c_4 = -1.231739516$,
$c_5 = 0.00120858003$, and $c_6 = -0.00000536382$. Use the reflection formula given above to obtain
results for $\mathrm{Re}(z) < 1$ e.g. for negative real arguments. Beware, however, to avoid the
singularities. While implementing routines for the Gamma function it is advisable
to evaluate the natural logarithm in order to avoid numerical overflow.

An alternative way of evaluating $\ln\Gamma(z)$ is given in references [44,45], giving formulae
which also are used in order to evaluate the Digamma function below. The expressions
used for $\ln\Gamma(z)$ are

$$\ln\Gamma(z) = \begin{cases} \left(z-\frac12\right)\ln z - z + \frac12\ln 2\pi + z\sum_{k=1}^{K}\frac{B_{2k}}{2k(2k-1)}\,z^{-2k} + R_K(z) & \text{for } 0 < x_0 \le x \\ \ln\Gamma(z+n) - \ln\prod_{k=0}^{n-1}(z+k) & \text{for } 0 \le x < x_0 \\ \ln\pi - \ln\Gamma(1-z) - \ln\sin\pi z & \text{for } x < 0 \end{cases}$$

The third case is the logarithm of the reflection formula given above.
Here $n = [x_0] - [x]$ (the difference of integer parts, where x is the real part of $z = x + iy$)
and e.g. $K = 10$ and $x_0 = 7.0$ gives excellent accuracy i.e. small $R_K$. Note that Kölbig [45]
gives the wrong sign on the (third) constant term in the first case above.
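
The Lanczos formula with the coefficients quoted above is straightforward to code. The following Python sketch evaluates $\ln\Gamma(x)$ for real arguments only; the function name and the restriction to $x \ge 1$ (where the quoted error bound applies directly) are our own choices, and smaller arguments would first be shifted with the recurrence or reflection formula.

```python
import math

# Lanczos parameters as quoted in the text (gamma = 5, n = 6)
LANCZOS_G = 5.0
LANCZOS_C = (1.0, 76.18009173, -86.50532033, 24.01409822,
             -1.231739516, 0.00120858003, -0.00000536382)


def ln_gamma(x):
    """ln Gamma(x) for real x >= 1 using the Lanczos formula for Gamma(z + 1)."""
    if x < 1.0:
        raise ValueError("this sketch handles x >= 1 only")
    z = x - 1.0                               # formula is written for Gamma(z + 1)
    tmp = z + LANCZOS_G + 0.5
    series = LANCZOS_C[0] + sum(c / (z + k)
                                for k, c in enumerate(LANCZOS_C[1:], start=1))
    return (z + 0.5) * math.log(tmp) - tmp + math.log(math.sqrt(2.0 * math.pi) * series)
```

Working with the logarithm, as recommended above, keeps the routine usable for arguments where $\Gamma(x)$ itself would overflow.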


42.2.2 Formulae 


Below we list some useful relations concerning the Gamma function, factorials and semifactorials
(denoted by two exclamation marks here). For a more complete list consult e.g.
[42].

$\Gamma(z) = \int_0^1 \left(\ln\frac1t\right)^{z-1} dt$

$\Gamma(z+1) = z\,\Gamma(z) = z!$

$\Gamma(z) = \alpha^z \int_0^\infty t^{z-1}\,e^{-\alpha t}\,dt$ for $\mathrm{Re}(z) > 0$, $\mathrm{Re}(\alpha) > 0$

$\Gamma(k) = (k-1)!$ for $k \ge 1$ (integer, $0! = 1$)

$z! = \Gamma(z+1) = \int_0^\infty e^{-t}\,t^z\,dt$ for $\mathrm{Re}(z) > -1$

$\Gamma\left(\frac12\right) = \sqrt{\pi}$

$\Gamma\left(n+\frac12\right) = \frac{(2n-1)!!}{2^n}\sqrt{\pi}$

$\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin\pi z}$

$z!\,(-z)! = \frac{\pi z}{\sin\pi z}$

$(2m)!! = 2\cdot4\cdot6\cdots2m = 2^m\,m!$

$(2m-1)!! = 1\cdot3\cdot5\cdots(2m-1)$

$(2m)! = (2m)!!\,(2m-1)!! = 2^m\,m!\,(2m-1)!!$


42.3 Digamma Function 

It is often convenient to work with the logarithm of the Gamma function in order to avoid
numerical overflow in the calculations. The first derivative of this function

$$\psi(z) = \frac{d}{dz}\ln\Gamma(z) = \frac{1}{\Gamma(z)}\frac{d\Gamma(z)}{dz}$$

is known as the Digamma, or Psi, function. A series expansion of this function is given by

$$\psi(z+1) = -\gamma - \sum_{n=1}^{\infty}\left(\frac{1}{z+n} - \frac1n\right) \quad\text{for } z \ne 0, -1, -2, -3, \ldots$$

where $\gamma \approx 0.5772156649$ is Euler’s constant which is seen to be equal to $-\psi(1)$. If the
derivative of the Gamma function itself is required we may thus simply use $d\Gamma(z)/dz = \Gamma(z)\cdot\psi(z)$.
Note that some authors write $\psi(z) = \frac{d}{dz}\ln\Gamma(z+1) = \frac{d}{dz}\ln z!$ for the Digamma
function, and similarly for the polygamma functions below, thus shifting the argument by
one unit.

In figure 34 we show the Digamma function for real arguments from $-5$ to 5. Note the
singularities at $x = 0, -1, -2, \ldots$.

Figure 34: The Digamma, or Psi, function

For integer values of z we may write

$$\psi(n) = -\gamma + \sum_{m=1}^{n-1}\frac1m$$

which is efficient enough for numerical calculations for not too large values of n. Similarly
for half-integer values we have

$$\psi\left(n+\tfrac12\right) = -\gamma - 2\ln2 + 2\sum_{m=1}^{n}\frac{1}{2m-1}$$

However, for arbitrary arguments the series expansion above is unusable. Following the
recipe given in an article by K. S. Kölbig [45] we use

$$\psi(z) = \begin{cases} \ln z - \frac{1}{2z} - \sum_{k=1}^{K}\frac{B_{2k}}{2k\,z^{2k}} + R_K(z) & \text{for } 0 < x_0 \le x \\ \psi(z+n) - \sum_{k=0}^{n-1}\frac{1}{z+k} & \text{for } 0 \le x < x_0 \\ \psi(-z) - \frac1z - \pi\cot\pi z & \text{for } x < 0 \end{cases}$$

where the third case follows from the reflection relation $\psi(1-z) - \psi(z) = \pi\cot\pi z$ together with
$\psi(1-z) = \psi(-z) - \frac1z$.
Here $n = [x_0] - [x]$ (the difference of integer parts, where x is the real part of $z = x + iy$)
and we have chosen $K = 10$ and $x_0 = 7.0$ which gives a very good accuracy (i.e. small
$R_K$, typically less than $10^{-15}$) for double precision calculations. The main interest in
statistical calculations is normally function values for $\psi(x)$ for real positive arguments
but the formulae above are valid for any complex argument except for the singularities
along the real axis at $z = 0, -1, -2, -3, \ldots$. The $B_{2k}$ are Bernoulli numbers given by

$B_0 = 1$, $B_1 = -\frac12$, $B_2 = \frac16$, $B_4 = -\frac{1}{30}$, $B_6 = \frac{1}{42}$, $B_8 = -\frac{1}{30}$, $B_{10} = \frac{5}{66}$, $B_{12} = -\frac{691}{2730}$,

$B_{14} = \frac76$, $B_{16} = -\frac{3617}{510}$, $B_{18} = \frac{43867}{798}$, $B_{20} = -\frac{174611}{330}$

42.4 Polygamma Function 

Higher order derivatives of $\ln\Gamma(z)$ are called Polygamma functions (see footnote 11)

$$\psi^{(n)}(z) = \frac{d^n}{dz^n}\psi(z) = \frac{d^{n+1}}{dz^{n+1}}\ln\Gamma(z) \quad\text{for } n = 1, 2, 3, \ldots$$

Here a series expansion is given by

$$\psi^{(n)}(z) = (-1)^{n+1}\,n!\sum_{k=0}^{\infty}\frac{1}{(z+k)^{n+1}} \quad\text{for } z \ne 0, -1, -2, \ldots$$

For numerical calculations we have adopted a technique similar to what was used to
evaluate $\ln\Gamma(z)$ and $\psi(z)$.

$$\psi^{(n)}(z) = \begin{cases} (-1)^{n-1}\left[t_1 + \frac{n!}{2z^{n+1}} + \sum_{k=1}^{K} B_{2k}\,\frac{(2k+n-1)!}{(2k)!\,z^{2k+n}} + R_K(z)\right] & \text{for } 0 < x_0 \le x \\ \psi^{(n)}(z+m) - (-1)^n\,n!\sum_{k=0}^{m-1}\frac{1}{(z+k)^{n+1}} & \text{for } 0 \le x < x_0 \end{cases}$$

where $t_1 = -\ln z$ for $n = 0$ and $t_1 = (n-1)!/z^n$ for $n > 0$. Here $m = [x_0] - [x]$ i.e.
the difference of integer parts, where x is the real part of $z = x + iy$. We treat primarily
the case for real positive arguments x and if complex arguments are required one ought
to add a third reflection formula as was done in the previous cases. Without any special
optimization we have chosen $K = 14$ and $x_0 = 7.0$ which gives a very good accuracy, i.e.
small $R_K$, typically less than $10^{-15}$, even for double precision calculations except for higher
orders and low values of x where the function value itself gets large (see footnote 12).

For more relations on the Polygamma (and the Digamma) functions see e.g. [42]. Two
useful relations used in this document in finding cumulants for some distributions are

$$\psi^{(n)}(1) = (-1)^{n+1}\,n!\,\zeta(n+1)$$

$$\psi^{(n)}\left(\tfrac12\right) = (-1)^{n+1}\,n!\,(2^{n+1}-1)\,\zeta(n+1) = (2^{n+1}-1)\,\psi^{(n)}(1)$$

where $\zeta$ is Riemann’s zeta function (see page 59 and [31]).

11 Sometimes the more specific notation tri-, tetra-, penta- and hexagamma functions are used for $\psi'$,
$\psi''$, $\psi^{(3)}$ and $\psi^{(4)}$, respectively.

12 For this calculation we need a few more Bernoulli numbers not given on page 158 above namely
$B_{22} = \frac{854513}{138}$, $B_{24} = -\frac{236364091}{2730}$, $B_{26} = \frac{8553103}{6}$, and $B_{28} = -\frac{23749461029}{870}$.


















42.5 The Incomplete Gamma Function 

For the incomplete Gamma function there seem to be several definitions in the literature.
Defining the two integrals

$$\gamma(a,x) = \int_0^x t^{a-1}\,e^{-t}\,dt \quad\text{and}\quad \Gamma(a,x) = \int_x^\infty t^{a-1}\,e^{-t}\,dt$$

with $\mathrm{Re}(a) > 0$ the incomplete Gamma function is normally defined as

$$P(a,x) = \frac{\gamma(a,x)}{\Gamma(a)}$$

but sometimes also $\gamma(a,x)$ and $\Gamma(a,x)$ are referred to under the same name as well as the
complement to $P(a,x)$

$$Q(a,x) = 1 - P(a,x) = \frac{\Gamma(a,x)}{\Gamma(a)}$$

Note that, by definition, $\gamma(a,x) + \Gamma(a,x) = \Gamma(a)$.

In figure 35 the incomplete Gamma function $P(a,x)$ is shown for a few a-values (0.5,
1, 5 and 10).

Figure 35: The incomplete Gamma function


42.5.1 Numerical Calculation 

For numerical evaluations of P two formulae are useful [10]. For values $x < a+1$ the series

$$\gamma(a,x) = e^{-x}\,x^a \sum_{n=0}^{\infty}\frac{\Gamma(a)}{\Gamma(a+n+1)}\,x^n$$

converges rapidly while for $x > a+1$ the continued fraction

$$\Gamma(a,x) = e^{-x}\,x^a\;\cfrac{1}{x+\cfrac{1-a}{1+\cfrac{1}{x+\cfrac{2-a}{1+\cfrac{2}{x+\cdots}}}}}$$

is a better choice.
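
A compact way to try out the two formulae is sketched below in Python for real $a > 0$ and $x \ge 0$. The function name is our own, `math.lgamma` supplies $\ln\Gamma(a)$, and the continued fraction is simply evaluated bottom-up to a fixed depth (an assumption made for brevity; production code would rather iterate until convergence).

```python
import math


def gamma_p(a, x, eps=1e-15, max_terms=10000, cf_depth=500):
    """Incomplete Gamma function P(a, x) for real a > 0, x >= 0."""
    if a <= 0.0 or x < 0.0:
        raise ValueError("need a > 0 and x >= 0")
    if x == 0.0:
        return 0.0
    # e^(-x) x^a / Gamma(a), evaluated via logarithms to avoid overflow
    prefactor = math.exp(-x + a * math.log(x) - math.lgamma(a))
    if x < a + 1.0:
        # series: sum_n x^n / (a (a+1) ... (a+n))
        term = 1.0 / a
        total = term
        for n in range(1, max_terms):
            term *= x / (a + n)
            total += term
            if term < eps * total:
                break
        return prefactor * total
    # continued fraction, evaluated from the innermost level outwards
    f = x
    for k in range(cf_depth, 0, -1):
        f = x + (k - a) / (1.0 + k / f)
    return 1.0 - prefactor / f
```

Known special values such as $P(1,x) = 1 - e^{-x}$ and $P(\frac12,x) = \mathrm{erf}\sqrt{x}$ are useful for testing both branches.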

42.5.2 Formulae 

Below we list some relations concerning the incomplete Gamma function. For a more
complete list consult e.g. [42].

$\Gamma(a) = \gamma(a,x) + \Gamma(a,x)$

$\gamma(a,x) = \int_0^x e^{-t}\,t^{a-1}\,dt$ for $\mathrm{Re}(a) > 0$

$\gamma(a+1,x) = a\,\gamma(a,x) - x^a\,e^{-x}$

$\gamma(n,x) = (n-1)!\left[1 - e^{-x}\sum_{r=0}^{n-1}\frac{x^r}{r!}\right]$

$\Gamma(a,x) = \int_x^\infty e^{-t}\,t^{a-1}\,dt$

$\Gamma(a+1,x) = a\,\Gamma(a,x) + x^a\,e^{-x}$

$\Gamma(n,x) = (n-1)!\,e^{-x}\sum_{r=0}^{n-1}\frac{x^r}{r!}$ for $n = 1, 2, \ldots$


42.5.3 Special Cases 

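The usage of the incomplete Gamma function $P(a,x)$ in calculations made in this document
often involves integer or half-integer values for a. These cases may be solved by the following
formulae

$P(n,x) = 1 - e^{-x}\sum_{k=0}^{n-1}\frac{x^k}{k!}$

$P\left(\frac12,x\right) = \mathrm{erf}\sqrt{x}$

$P(a+1,x) = P(a,x) - \frac{x^a\,e^{-x}}{\Gamma(a+1)} = P(a,x) - \frac{x^a\,e^{-x}}{a\,\Gamma(a)}$

$P\left(\frac n2,x\right) = \mathrm{erf}\sqrt{x} - \sum_{k=1}^{\frac{n-1}{2}}\frac{x^{\frac{2k-1}{2}}\,e^{-x}}{\Gamma\left(\frac{2k+1}{2}\right)} = \mathrm{erf}\sqrt{x} - 2\,e^{-x}\sqrt{\frac x\pi}\sum_{k=1}^{\frac{n-1}{2}}\frac{(2x)^{k-1}}{(2k-1)!!}$

the last formula for odd values of n.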
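
These closed forms avoid any general-purpose routine when a is an integer or a half-integer. A minimal Python sketch, with function names of our own choosing and `math.erf` assumed for the error function, could look like this:

```python
import math


def gamma_p_int(n, x):
    """P(n, x) for a positive integer n."""
    term, total = 1.0, 1.0                    # k = 0 term of sum x^k / k!
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def gamma_p_half_odd(n, x):
    """P(n/2, x) for an odd positive integer n."""
    total, term = 0.0, 1.0                    # term = (2x)^(k-1) / (2k-1)!!
    for k in range(1, (n - 1) // 2 + 1):
        if k > 1:
            term *= 2.0 * x / (2 * k - 1)
        total += term
    return math.erf(math.sqrt(x)) - 2.0 * math.exp(-x) * math.sqrt(x / math.pi) * total
```

Since the chi-squared cumulative distribution for n degrees of freedom is $P(\frac n2,\frac x2)$, these two functions suffice to evaluate it for any integer n.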


42.6 The Beta Function 

The Beta function is defined through the integral formula

$$B(a,b) = B(b,a) = \int_0^1 t^{a-1}(1-t)^{b-1}\,dt$$

and is related to the Gamma function by

$$B(a,b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)}$$

The most straightforward way to calculate the Beta function is by using this last expression
and a well optimized routine for the Gamma function. In table 9 on page 179 expressions
for the Beta function for low integer and half-integer arguments are given.

Another integral, obtained by the substitution $x = t/(1-t)$, yielding the Beta function
is

$$B(a,b) = \int_0^\infty \frac{x^{a-1}}{(1+x)^{a+b}}\,dx$$


42.7 The Incomplete Beta Function 

The incomplete Beta function is defined as

$$I_x(a,b) = \frac{B_x(a,b)}{B(a,b)} = \frac{1}{B(a,b)}\int_0^x t^{a-1}(1-t)^{b-1}\,dt$$

for $a, b > 0$ and $0 \le x \le 1$.

The function $B_x(a,b)$, often also called the incomplete Beta function, satisfies the following
formula

$$B_x(a,b) = \int_0^{\frac{x}{1-x}} \frac{u^{a-1}}{(1+u)^{a+b}}\,du = B_1(b,a) - B_{1-x}(b,a)$$

$$= x^a\left[\frac1a + \frac{1-b}{a+1}\,x + \frac{(1-b)(2-b)}{2!\,(a+2)}\,x^2 + \ldots + \frac{(1-b)(2-b)\cdots(n-b)}{n!\,(a+n)}\,x^n + \ldots\right]$$

In figure 36 the incomplete Beta function is shown for a few (a, b)-values. Note that by
symmetry the (1, 5) and (5, 1) curves are reflected around the diagonal. For large values of
a and b the curve rises sharply from near zero to near one around $x = a/(a+b)$.


42.7.1 Numerical Calculation 

Figure 36: The incomplete Beta function

In order to obtain $I_x(a,b)$ the series expansion

$$I_x(a,b) = \frac{x^a(1-x)^b}{a\,B(a,b)}\left[1 + \sum_{n=0}^{\infty}\frac{B(a+1,n+1)}{B(a+b,n+1)}\,x^{n+1}\right]$$

is not the most useful formula for computations. The continued fraction formula

$$I_x(a,b) = \frac{x^a(1-x)^b}{a\,B(a,b)}\;\cfrac{1}{1+\cfrac{d_1}{1+\cfrac{d_2}{1+\cdots}}}$$

turns out to be a better choice [10]. Here

$$d_{2m+1} = -\frac{(a+m)(a+b+m)\,x}{(a+2m)(a+2m+1)} \quad\text{and}\quad d_{2m} = \frac{m(b-m)\,x}{(a+2m-1)(a+2m)}$$

and the formula converges rapidly for $x < (a+1)/(a+b+1)$. For other x-values the same
formula may be used after applying the symmetry relation

$$I_x(a,b) = 1 - I_{1-x}(b,a)$$
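
The continued fraction together with the symmetry relation can be sketched as follows in Python. The function names are our own; the fraction is evaluated with the modified Lentz method, which is an implementation choice of this sketch and not something prescribed by the text, and `math.lgamma` is used for $\ln B(a,b)$.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction 1/(1+ d1/(1+ d2/(1+ ...))) by the modified Lentz method."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)          # 1 + d_1
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for coeff in (
            m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m)),             # d_2m
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0)),  # d_2m+1
        ):
            d = 1.0 + coeff * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + coeff / c
            if abs(c) < tiny:
                c = tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def incomplete_beta(x, a, b):
    """I_x(a, b) for a, b > 0 and 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if x >= (a + 1.0) / (a + b + 1.0):         # use the symmetry relation
        return 1.0 - incomplete_beta(1.0 - x, b, a)
    ln_front = (a * math.log(x) + b * math.log(1.0 - x)
                + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))
    return math.exp(ln_front) / a * _beta_cf(a, b, x)
```

Simple closed forms such as $I_x(1,1) = x$ and $I_x(2,3) = 6x^2 - 8x^3 + 3x^4$ can be used to check the routine on both sides of the switching point.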


42.7.2 Approximation 

For higher values of a and b, in fact already from $a + b > 6$, the incomplete Beta function
may be approximated by

• For $(a+b+1)(1-x) \le 0.8$ using an approximation to the chi-square distribution in
the variable $\chi^2 = (a+b-1)(1-x)(3-x) - (1-x)(b-1)$ with $n = 2b$ degrees of
freedom.

• For $(a+b+1)(1-x) \ge 0.8$ using an approximation to the standard normal distribution
in the variable

$$z = \frac{3\left[w_1\left(1-\frac{1}{9b}\right) - w_2\left(1-\frac{1}{9a}\right)\right]}{\sqrt{\frac{w_1^2}{b} + \frac{w_2^2}{a}}}$$

where $w_1 = \sqrt[3]{bx}$ and $w_2 = \sqrt[3]{a(1-x)}$.

In both cases the maximum difference to the true cumulative distribution is below 0.005
all the way down to the limit where $a + b = 6$ [26].

42.8 Relations to Probability Density Functions

The incomplete Gamma and Beta functions, $P(a,x)$ and $I_x(a,b)$, are related to many standard
probability density functions or rather to their cumulative (distribution) functions.
We give very brief examples here. For more details on these distributions consult any book
in statistics.

42.8.1 The Beta Distribution 

The cumulative distribution for the Beta distribution with parameters p and q is given by

$$F(x) = \frac{1}{B(p,q)}\int_0^x t^{p-1}(1-t)^{q-1}\,dt = \frac{B_x(p,q)}{B(p,q)} = I_x(p,q)$$

i.e. simply the incomplete Beta function.

42.8.2 The Binomial Distribution

For the binomial distribution with parameters n and p

$$\sum_{j=k}^{n}\binom nj\,p^j(1-p)^{n-j} = I_p(k, n-k+1)$$

i.e. the cumulative distribution may be obtained by

$$P(k) = \sum_{i=0}^{k}\binom ni\,p^i(1-p)^{n-i} = I_{1-p}(n-k, k+1)$$

However, take care not to evaluate $P(n)$, which obviously is unity, using the incomplete Beta
function since this is not defined for arguments which are less than or equal to zero (here the
first argument $n-k$ would be zero).


42.8.3 The Chi-squared Distribution 

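The cumulative chi-squared distribution for n degrees of freedom is given by

$$F(x) = \frac{1}{2\,\Gamma\left(\frac n2\right)}\int_0^x \left(\frac u2\right)^{\frac n2-1} e^{-\frac u2}\,du = \frac{1}{\Gamma\left(\frac n2\right)}\int_0^{\frac x2} y^{\frac n2-1}\,e^{-y}\,dy = \frac{\gamma\left(\frac n2,\frac x2\right)}{\Gamma\left(\frac n2\right)} = P\left(\frac n2,\frac x2\right)$$

where x is the chi-squared value sometimes denoted $\chi^2$. In this calculation we made the
simple substitution $y = u/2$ in simplifying the integral.

42.8.4 The F-distribution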

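The cumulative F-distribution with m and n degrees of freedom is given by

$$F(x) = \frac{1}{B\left(\frac m2,\frac n2\right)}\int_0^x \frac{m^{\frac m2}\,n^{\frac n2}\,F^{\frac m2-1}}{(mF+n)^{\frac{m+n}{2}}}\,dF = \frac{1}{B\left(\frac m2,\frac n2\right)}\int_0^x \left(\frac{mF}{mF+n}\right)^{\frac m2}\left(\frac{n}{mF+n}\right)^{\frac n2}\frac{dF}{F}$$

$$= \frac{1}{B\left(\frac m2,\frac n2\right)}\int_0^{\frac{mx}{mx+n}} y^{\frac m2-1}(1-y)^{\frac n2-1}\,dy = \frac{B_z\left(\frac m2,\frac n2\right)}{B\left(\frac m2,\frac n2\right)} = I_z\left(\frac m2,\frac n2\right)$$

with $z = mx/(n+mx)$. Here we have made the substitution $y = mF/(mF+n)$, leading
to $dF/F = dy/(y(1-y))$, in simplifying the integral.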


42.8.5 The Gamma Distribution 

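Not surprisingly the cumulative distribution for the Gamma distribution with parameters
a and b is given by an incomplete Gamma function.

$$F(x) = \int_0^x f(u)\,du = \frac{a^b}{\Gamma(b)}\int_0^x u^{b-1}\,e^{-au}\,du = \frac{a^b}{\Gamma(b)}\int_0^{ax}\left(\frac va\right)^{b-1} e^{-v}\,\frac{dv}{a}$$

$$= \frac{1}{\Gamma(b)}\int_0^{ax} v^{b-1}\,e^{-v}\,dv = \frac{\gamma(b,ax)}{\Gamma(b)} = P(b,ax)$$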


42.8.6 The Negative Binomial Distribution 

The negative binomial distribution with parameters n and p is related to the incomplete
Beta function via the relation

$$\sum_{s=a}^{\infty}\binom{n+s-1}{s}\,p^n(1-p)^s = I_{1-p}(a,n)$$

Also the geometric distribution, a special case of the negative binomial distribution, is
connected to the incomplete Beta function, see summary below.


42.8.7 The Normal Distribution 

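The cumulative normal, or Gaussian, distribution is given by (see footnote 13)

$$F(x) = \begin{cases} \frac12 + \frac12\,P\left(\frac12,\frac{x^2}{2}\right) & \text{if } x \ge 0 \\ \frac12 - \frac12\,P\left(\frac12,\frac{x^2}{2}\right) & \text{if } x < 0 \end{cases}$$

where $P\left(\frac12,\frac{x^2}{2}\right)$ is the incomplete Gamma function occurring as twice the integral of the
standard normal curve from 0 to x since

$$\frac{1}{\sqrt{2\pi}}\int_0^x e^{-\frac{t^2}{2}}\,dt = \frac{1}{2\sqrt{\pi}}\int_0^{\frac{x^2}{2}} \frac{e^{-u}}{\sqrt{u}}\,du = \frac{\gamma\left(\frac12,\frac{x^2}{2}\right)}{2\,\Gamma\left(\frac12\right)} = \frac12\,P\left(\frac12,\frac{x^2}{2}\right)$$

13 Without loss of generality it is enough to regard the standard normal density.

The so called error function may be expressed in terms of the incomplete Gamma
function

$$\mathrm{erf}\,x = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt = P\left(\tfrac12,x^2\right)$$

as is the case for the complementary error function

$$\mathrm{erfc}\,x = 1 - \mathrm{erf}\,x = \frac{2}{\sqrt{\pi}}\int_x^\infty e^{-t^2}\,dt = 1 - P\left(\tfrac12,x^2\right)$$

defined for $x \ge 0$, for $x < 0$ use $\mathrm{erf}(-x) = -\mathrm{erf}(x)$ and $\mathrm{erfc}(-x) = 1 + \mathrm{erf}(x)$. See also
section 13.

There are also other series expansions for erf x like

$$\mathrm{erf}\,x = \frac{2}{\sqrt{\pi}}\left[x - \frac{x^3}{3\cdot1!} + \frac{x^5}{5\cdot2!} - \frac{x^7}{7\cdot3!} + \ldots\right]$$

$$= 1 - \frac{e^{-x^2}}{x\sqrt{\pi}}\left[1 - \frac{1}{2x^2} + \frac{1\cdot3}{(2x^2)^2} - \frac{1\cdot3\cdot5}{(2x^2)^3} + \ldots\right]$$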

42.8.8 The Poisson Distribution 

Although the Poisson distribution is a probability density function in a discrete variable
the cumulative distribution may be expressed in terms of the incomplete Gamma function.
The probability for outcomes from zero to k − 1 inclusive for a Poisson distribution with
parameter (mean) μ is

\[
P(<k) = \sum_{n=0}^{k-1} \frac{\mu^n e^{-\mu}}{n!} = 1 - P(k, \mu) \qquad \text{for } k = 1, 2, \ldots
\]
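A quick numerical check of this relation can be made by comparing the direct Poisson sum with an independent evaluation of P(k, μ). In the sketch below the incomplete Gamma integral is computed with a composite Simpson rule, which is an arbitrary choice made for illustration, as are the function names.

```python
import math

def poisson_cdf_below(k, mu):
    """Probability of the outcomes 0, 1, ..., k-1 for Poisson mean mu (direct sum)."""
    term = math.exp(-mu)          # n = 0 term
    total = 0.0
    for n in range(k):
        total += term
        term *= mu / (n + 1)
    return total

def reg_lower_gamma_int(k, x, steps=2000):
    """P(k, x) for integer k >= 1 by Simpson integration of t^(k-1) e^(-t) on [0, x]."""
    h = x / steps                 # steps must be even
    f = lambda t: t ** (k - 1) * math.exp(-t)
    s = f(0.0) + f(x)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(i * h)
    return s * h / 3.0 / math.factorial(k - 1)

k, mu = 3, 2.5
print(poisson_cdf_below(k, mu), 1.0 - reg_lower_gamma_int(k, mu))
```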


42.8.9 Student's t-distribution

The symmetric integral of the t-distribution with n degrees of freedom, often denoted
A(t|n), is given by

\[
A(t|n) = \frac{1}{\sqrt{n}\,B\left(\frac{1}{2},\frac{n}{2}\right)} \int_{-t}^{t} \left(\frac{n}{n+x^2}\right)^{\frac{n+1}{2}} dx
= I_z\left(\frac{1}{2},\frac{n}{2}\right)
\]

with z = t²/(n + t²).
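The identity between the symmetric integral A(t|n) and the incomplete Beta function with z = t²/(n + t²) can be checked numerically. The sketch below integrates both sides with a composite Simpson rule; the helper names are hypothetical, and the substitution y = s² inside `inc_beta_half` is only a device to avoid the integrable singularity of the Beta integrand at y = 0.

```python
import math

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    s = f(a) + f(b)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def beta_half(n):
    """Complete Beta function B(1/2, n/2) via log-Gamma."""
    return math.exp(math.lgamma(0.5) + math.lgamma(n / 2) - math.lgamma((n + 1) / 2))

def t_symmetric_integral(t, n):
    """A(t|n): integral of the t-density with n degrees of freedom from -t to t."""
    dens = lambda x: (n / (n + x * x)) ** ((n + 1) / 2)
    return simpson(dens, -t, t) / (math.sqrt(n) * beta_half(n))

def inc_beta_half(z, n):
    """I_z(1/2, n/2) after the substitution y = s^2 in the Beta integral."""
    g = lambda s: (1.0 - s * s) ** (n / 2 - 1)
    return 2.0 * simpson(g, 0.0, math.sqrt(z)) / beta_half(n)

t, n = 1.5, 5
print(t_symmetric_integral(t, n), inc_beta_half(t * t / (n + t * t), n))
```

For n = 1 and n = 2 the symmetric integral has the closed forms (2/π) arctan t and t/√(2 + t²), which give a further check.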

42.8.10 Summary 

The following table summarizes the relations between the cumulative distribution
functions of some standard probability density functions and the incomplete Gamma and Beta
functions.

Distribution        Parameters   Cumulative distribution                         Range
Beta                p, q         F(x) = I_x(p, q)                                0 ≤ x ≤ 1
Binomial            n, p         P(k) = I_{1-p}(n-k, k+1)                        k = 0, 1, ..., n
Chi-squared         n            F(x) = P(n/2, x/2)                              x ≥ 0
F                   m, n         F(x) = I_{mx/(n+mx)}(m/2, n/2)                  x ≥ 0
Gamma               a, b         F(x) = P(b, ax)                                 x ≥ 0
Geometric           p            P(k) = I_p(1, k)                                k = 1, 2, ...
Negative binomial   n, p         P(k) = I_p(n, k+1)                              k = 0, 1, ...
Standard normal                  F(x) = 1/2 - (1/2) P(1/2, x^2/2)                -∞ < x ≤ 0
                                 F(x) = 1/2 + (1/2) P(1/2, x^2/2)                0 ≤ x < ∞
Poisson             μ            P(k) = 1 - P(k+1, μ)                            k = 0, 1, ...
Student             n            F(x) = 1/2 - (1/2) I_{x^2/(n+x^2)}(1/2, n/2)    -∞ < x ≤ 0
                                 F(x) = 1/2 + (1/2) I_{x^2/(n+x^2)}(1/2, n/2)    0 ≤ x < ∞
43 Appendix B: Hypergeometric Functions

43.1 Introduction

The hypergeometric and the confluent hypergeometric functions have a central role inasmuch
as many standard functions may be expressed in terms of them. This appendix is based on
information from [41,46,47] in which much more detailed information on the hypergeometric
and confluent hypergeometric functions may be found.


43.2 Hypergeometric Function 

The hypergeometric equation, sometimes called Gauss's differential equation, is given by
[41,46]

\[
x(1-x)\frac{\partial^2 f(x)}{\partial x^2} + [c - (a+b+1)x]\frac{\partial f(x)}{\partial x} - ab f(x) = 0
\]

One solution is

\[
f(x) = {}_2F_1(a,b,c;x) = 1 + \frac{ab}{c}\frac{x}{1!} + \frac{a(a+1)\,b(b+1)}{c(c+1)}\frac{x^2}{2!} + \ldots \qquad c \ne 0, -1, -2, -3, \ldots
\]


The range of convergence is |x| < 1 and x = 1, for c > a + b, and x = −1, for c > a + b − 1.
Using the so called Pochhammer symbol

\[
(a)_n = a(a+1)(a+2)\cdots(a+n-1) = \frac{(a+n-1)!}{(a-1)!} = \frac{\Gamma(a+n)}{\Gamma(a)}
\]

with (a)_0 = 1 this solution may be written¹⁴ as

\[
{}_2F_1(a,b,c;x) = \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n}\frac{x^n}{n!}
= \frac{\Gamma(c)}{\Gamma(a)\Gamma(b)} \sum_{n=0}^{\infty} \frac{\Gamma(a+n)\Gamma(b+n)}{\Gamma(c+n)}\frac{x^n}{n!}
\]

By symmetry ₂F₁(a,b,c;x) = ₂F₁(b,a,c;x), and when the risk for confusion is negligible
the indices are sometimes dropped and one simply writes F(a,b,c;x).

Another independent solution to the hypergeometric equation is 

\[
f(x) = x^{1-c}\,{}_2F_1(a+1-c,\, b+1-c,\, 2-c;\, x) \qquad c \ne 2, 3, 4, \ldots
\]

The nth derivative of the hypergeometric function is given by

\[
\frac{d^n}{dx^n}\,{}_2F_1(a,b,c;x) = \frac{(a)_n (b)_n}{(c)_n}\,{}_2F_1(a+n,\, b+n,\, c+n;\, x)
\]

and

\[
{}_2F_1(a,b,c;x) = (1-x)^{c-a-b}\,{}_2F_1(c-a,\, c-b,\, c;\, x)
\]

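¹⁴ The notation ₂F₁ indicates the presence of two Pochhammer symbols in the numerator and one in the
denominator.

Several common mathematical functions may be expressed in terms of the hypergeometric
function, such as the incomplete Beta function B_x(a, b), the complete elliptic integrals
K and E, the Gegenbauer functions T_n^β(x), the Legendre functions P_n(x), P_n^m(x) and Q_ν(x)
(second kind), and the Chebyshev functions T_n(x), U_n(x) and V_n(x)

\begin{align*}
(1-z)^{-a} &= {}_2F_1(a, b, b; z) \\
\ln(1+z) &= z \cdot {}_2F_1(1, 1, 2; -z) \\
\arctan z &= z \cdot {}_2F_1\left(\tfrac{1}{2}, 1, \tfrac{3}{2}; -z^2\right) \\
\arcsin z &= z \cdot {}_2F_1\left(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{3}{2}; z^2\right) = z\sqrt{1-z^2}\;{}_2F_1\left(1, 1, \tfrac{3}{2}; z^2\right) \\
B_x(a,b) &= \frac{x^a}{a}\,{}_2F_1(a, 1-b, a+1; x) \\
K &= \int_0^{\pi/2} (1 - k^2\sin^2\theta)^{-\frac{1}{2}}\,d\theta = \frac{\pi}{2}\,{}_2F_1\left(\tfrac{1}{2}, \tfrac{1}{2}, 1; k^2\right) \\
E &= \int_0^{\pi/2} (1 - k^2\sin^2\theta)^{\frac{1}{2}}\,d\theta = \frac{\pi}{2}\,{}_2F_1\left(\tfrac{1}{2}, -\tfrac{1}{2}, 1; k^2\right) \\
T_n^{\beta}(x) &= \frac{(n+2\beta)!}{2^{\beta}\, n!\, \beta!}\,{}_2F_1\left(-n, n+2\beta+1, 1+\beta; \tfrac{1-x}{2}\right) \\
P_n(x) &= {}_2F_1\left(-n, n+1, 1; \tfrac{1-x}{2}\right) \\
P_n^m(x) &= \frac{(n+m)!}{(n-m)!}\,\frac{(1-x^2)^{\frac{m}{2}}}{2^m\, m!}\,{}_2F_1\left(m-n, m+n+1, m+1; \tfrac{1-x}{2}\right) \\
P_{2n}(x) &= (-1)^n \frac{(2n-1)!!}{(2n)!!}\,{}_2F_1\left(-n, n+\tfrac{1}{2}, \tfrac{1}{2}; x^2\right) \\
P_{2n+1}(x) &= (-1)^n \frac{(2n+1)!!}{(2n)!!}\, x \cdot {}_2F_1\left(-n, n+\tfrac{3}{2}, \tfrac{3}{2}; x^2\right) \\
Q_\nu(x) &= \frac{\sqrt{\pi}\,\nu!}{\left(\nu+\tfrac{1}{2}\right)!\,(2x)^{\nu+1}}\,{}_2F_1\left(\tfrac{\nu+1}{2}, \tfrac{\nu}{2}+1, \nu+\tfrac{3}{2}; \tfrac{1}{x^2}\right) \\
T_n(x) &= {}_2F_1\left(-n, n, \tfrac{1}{2}; \tfrac{1-x}{2}\right) \\
U_n(x) &= (n+1) \cdot {}_2F_1\left(-n, n+2, \tfrac{3}{2}; \tfrac{1-x}{2}\right) \\
V_n(x) &= n\sqrt{1-x^2}\;{}_2F_1\left(-n+1, n+1, \tfrac{3}{2}; \tfrac{1-x}{2}\right)
\end{align*}

for Q_ν(x) the conditions are |x| > 1, |arg x| < π, and ν ≠ −1, −2, −3, .... See [46] for
many more similar and additional formulae.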
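Several of the relations above are easy to verify numerically by summing the Gauss series directly. The following sketch assumes |x| < 1 (or a terminating series) and uses a function name, `hyp2f1`, chosen here for illustration; it is not meant as a robust general-purpose evaluation.

```python
import math

def hyp2f1(a, b, c, x, tol=1e-16, max_terms=100_000):
    """Gauss series sum_n (a)_n (b)_n / (c)_n * x^n / n!, intended for |x| < 1."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        # ratio of consecutive terms: (a+n)(b+n) / ((c+n)(n+1)) * x
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

z = 0.3
print(z * hyp2f1(1, 1, 2, -z), math.log(1 + z))              # ln(1+z)
print(hyp2f1(2.5, 0.7, 0.7, z), (1 - z) ** -2.5)             # (1-z)^(-a)
print(0.6 * hyp2f1(0.5, 0.5, 1.5, 0.36), math.asin(0.6))     # arcsin z
print(hyp2f1(-2, 3, 1, (1 - 0.4) / 2), (3 * 0.4**2 - 1) / 2) # Legendre P_2(0.4)
```

When a or b is a negative integer the term ratio becomes zero and the loop stops, so the polynomial cases are summed exactly up to rounding.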

43.3 Confluent Hypergeometric Function 

The confluent hypergeometric equation, or Kummer’s equation as it is often called, is given 
by [41,47] 

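\[
x\frac{\partial^2 f(x)}{\partial x^2} + (c-x)\frac{\partial f(x)}{\partial x} - a f(x) = 0
\]

One solution to this equation is

\[
f(x) = {}_1F_1(a, c; x) = M(a, c; x) = \sum_{n=0}^{\infty} \frac{(a)_n}{(c)_n}\frac{x^n}{n!} \qquad c \ne 0, -1, -2, \ldots
\]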
This solution is convergent for all finite real x (or complex z). Another solution is given by

\[
f(x) = x^{1-c} M(a+1-c,\, 2-c;\, x) \qquad c \ne 2, 3, 4, \ldots
\]

Often a linear combination of the first and second solution is used

\[
U(a, c; x) = \frac{\pi}{\sin \pi c} \left[ \frac{M(a, c; x)}{(a-c)!\,(c-1)!} - \frac{x^{1-c} M(a+1-c,\, 2-c;\, x)}{(a-1)!\,(1-c)!} \right]
\]


The confluent hypergeometric functions M and U may be expressed in integral form as

\begin{align*}
M(a, c; x) &= \frac{\Gamma(c)}{\Gamma(a)\Gamma(c-a)} \int_0^1 e^{xt}\, t^{a-1} (1-t)^{c-a-1}\,dt & \mathrm{Re}\,c > \mathrm{Re}\,a > 0 \\
U(a, c; x) &= \frac{1}{\Gamma(a)} \int_0^{\infty} e^{-xt}\, t^{a-1} (1+t)^{c-a-1}\,dt & \mathrm{Re}\,x > 0,\ \mathrm{Re}\,a > 0
\end{align*}

Useful formulae are the Kummer transformations

\begin{align*}
M(a, c; x) &= e^{x} M(c-a,\, c;\, -x) \\
U(a, c; x) &= x^{1-c}\, U(a-c+1,\, 2-c;\, x)
\end{align*}

The nth derivatives of the confluent hypergeometric functions are given by

\begin{align*}
\frac{d^n}{dz^n} M(a, b; z) &= \frac{(a)_n}{(b)_n} M(a+n,\, b+n;\, z) \\
\frac{d^n}{dz^n} U(a, b; z) &= (-1)^n (a)_n\, U(a+n,\, b+n;\, z)
\end{align*}

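Several common mathematical functions may be expressed in terms of the confluent
hypergeometric function, such as the error function, the incomplete Gamma function γ(a, x),
Bessel functions J_ν(x), modified Bessel functions of the first kind I_ν(x), Hermite functions
H_n(x), Laguerre functions L_n(x), associated Laguerre functions L_n^m(x), Whittaker functions
M_{kμ}(x) and W_{kμ}(x), Fresnel integrals C(x) and S(x), and the modified Bessel function of
the second kind K_ν(x)

\begin{align*}
e^z &= M(a, a; z) \\
\mathrm{erf}(x) &= \frac{2}{\sqrt{\pi}}\, x\, M\left(\tfrac{1}{2}, \tfrac{3}{2}; -x^2\right) = \frac{2}{\sqrt{\pi}}\, x\, e^{-x^2} M\left(1, \tfrac{3}{2}; x^2\right) \\
\gamma(a, x) &= \frac{x^a}{a} M(a, a+1; -x) \qquad \mathrm{Re}\,a > 0 \\
J_\nu(x) &= \frac{e^{-ix}}{\nu!} \left(\frac{x}{2}\right)^{\nu} M\left(\nu+\tfrac{1}{2}, 2\nu+1; 2ix\right) \\
I_\nu(x) &= \frac{e^{-x}}{\nu!} \left(\frac{x}{2}\right)^{\nu} M\left(\nu+\tfrac{1}{2}, 2\nu+1; 2x\right) \\
H_{2n}(x) &= (-1)^n \frac{(2n)!}{n!} M\left(-n, \tfrac{1}{2}; x^2\right) \\
H_{2n+1}(x) &= (-1)^n \frac{2(2n+1)!}{n!}\, x\, M\left(-n, \tfrac{3}{2}; x^2\right) \\
L_n(x) &= M(-n, 1; x) \\
L_n^m(x) &= (-1)^m \frac{d^m}{dx^m} L_{n+m}(x) = \frac{(n+m)!}{n!\,m!} M(-n, m+1; x) \\
M_{k\mu}(x) &= e^{-\frac{x}{2}}\, x^{\mu+\frac{1}{2}} M\left(\mu-k+\tfrac{1}{2}, 2\mu+1; x\right) \\
W_{k\mu}(x) &= e^{-\frac{x}{2}}\, x^{\mu+\frac{1}{2}}\, U\left(\mu-k+\tfrac{1}{2}, 2\mu+1; x\right) \\
C(x) + iS(x) &= x\, M\left(\tfrac{1}{2}, \tfrac{3}{2}; \tfrac{i\pi x^2}{2}\right) \\
K_\nu(x) &= \sqrt{\pi}\, e^{-x} (2x)^{\nu}\, U\left(\nu+\tfrac{1}{2}, 2\nu+1; 2x\right)
\end{align*}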

See [47] for many more similar and additional formulae. 
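As for the Gauss function, some of the relations for M can be verified by summing the Kummer series term by term. The sketch below is limited to real arguments of moderate size, where the plain series is numerically well behaved; the name `hyp1f1` is chosen here only for illustration.

```python
import math

def hyp1f1(a, c, x, tol=1e-16, max_terms=100_000):
    """Kummer series M(a, c; x) = sum_n (a)_n / (c)_n * x^n / n!."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        # ratio of consecutive terms: (a+n) / ((c+n)(n+1)) * x
        term *= (a + n) / ((c + n) * (n + 1)) * x
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

x = 0.8
print(hyp1f1(0.7, 0.7, 1.2), math.exp(1.2))                          # e^z = M(a, a; z)
print(2 * x / math.sqrt(math.pi) * hyp1f1(0.5, 1.5, -x * x), math.erf(x))
print(hyp1f1(0.3, 1.7, 2.0), math.exp(2.0) * hyp1f1(1.4, 1.7, -2.0)) # Kummer transformation
print(hyp1f1(-2, 1, 1.5), 1 - 2 * 1.5 + 1.5**2 / 2)                  # Laguerre L_2(1.5)
```

For large negative x the alternating series loses accuracy through cancellation, and the Kummer transformation is then a convenient way to obtain a series of positive terms.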
Table 1: Percentage points of the chi-square distribution

                                   1 - α
   n   0.5000  0.8000  0.9000  0.9500  0.9750  0.9900  0.9950  0.9990
   1   0.4549  1.6424  2.7055  3.8415  5.0239  6.6349  7.8794  10.828
   2   1.3863  3.2189  4.6052  5.9915  7.3778  9.2103  10.597  13.816
   3   2.3660  4.6416  6.2514  7.8147  9.3484  11.345  12.838  16.266
   4   3.3567  5.9886  7.7794  9.4877  11.143  13.277  14.860  18.467
   5   4.3515  7.2893  9.2364  11.070  12.833  15.086  16.750  20.515
   6   5.3481  8.5581  10.645  12.592  14.449  16.812  18.548  22.458
   7   6.3458  9.8032  12.017  14.067  16.013  18.475  20.278  24.322
   8   7.3441  11.030  13.362  15.507  17.535  20.090  21.955  26.124
   9   8.3428  12.242  14.684  16.919  19.023  21.666  23.589  27.877
  10   9.3418  13.442  15.987  18.307  20.483  23.209  25.188  29.588
  11   10.341  14.631  17.275  19.675  21.920  24.725  26.757  31.264
  12   11.340  15.812  18.549  21.026  23.337  26.217  28.300  32.909
  13   12.340  16.985  19.812  22.362  24.736  27.688  29.819  34.528
  14   13.339  18.151  21.064  23.685  26.119  29.141  31.319  36.123
  15   14.339  19.311  22.307  24.996  27.488  30.578  32.801  37.697
  16   15.338  20.465  23.542  26.296  28.845  32.000  34.267  39.252
  17   16.338  21.615  24.769  27.587  30.191  33.409  35.718  40.790
  18   17.338  22.760  25.989  28.869  31.526  34.805  37.156  42.312
  19   18.338  23.900  27.204  30.144  32.852  36.191  38.582  43.820
  20   19.337  25.038  28.412  31.410  34.170  37.566  39.997  45.315
  21   20.337  26.171  29.615  32.671  35.479  38.932  41.401  46.797
  22   21.337  27.301  30.813  33.924  36.781  40.289  42.796  48.268
  23   22.337  28.429  32.007  35.172  38.076  41.638  44.181  49.728
  24   23.337  29.553  33.196  36.415  39.364  42.980  45.559  51.179
  25   24.337  30.675  34.382  37.652  40.646  44.314  46.928  52.620
  26   25.336  31.795  35.563  38.885  41.923  45.642  48.290  54.052
  27   26.336  32.912  36.741  40.113  43.195  46.963  49.645  55.476
  28   27.336  34.027  37.916  41.337  44.461  48.278  50.993  56.892
  29   28.336  35.139  39.087  42.557  45.722  49.588  52.336  58.301
  30   29.336  36.250  40.256  43.773  46.979  50.892  53.672  59.703
  40   39.335  47.269  51.805  55.758  59.342  63.691  66.766  73.402
  50   49.335  58.164  63.167  67.505  71.420  76.154  79.490  86.661
  60   59.335  68.972  74.397  79.082  83.298  88.379  91.952  99.607
  70   69.334  79.715  85.527  90.531  95.023  100.43  104.21  112.32
  80   79.334  90.405  96.578  101.88  106.63  112.33  116.32  124.84
  90   89.334  101.05  107.57  113.15  118.14  124.12  128.30  137.21
 100   99.334  111.67  118.50  124.34  129.56  135.81  140.17  149.45
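Entries of Table 1 can be reproduced by inverting the cumulative chi-square distribution F(x) = P(n/2, x/2). The sketch below is restricted to even n, where the incomplete Gamma function reduces to a finite Poisson-type sum, and uses plain bisection for the inversion; both restrictions and the function names are assumptions made to keep the example short.

```python
import math

def chi2_cdf_even(x, n):
    """Chi-square CDF for even n: P(n/2, x/2) = 1 - sum_{j<n/2} e^(-x/2) (x/2)^j / j!."""
    if n <= 0 or n % 2:
        raise ValueError("this sketch handles even n only")
    half = x / 2.0
    term = math.exp(-half)        # j = 0 term
    tail = 0.0
    for j in range(n // 2):
        tail += term
        term *= half / (j + 1)
    return 1.0 - tail

def chi2_percentage_point(prob, n, hi=1000.0, iters=200):
    """Solve chi2_cdf_even(x, n) = prob for x by bisection on [0, hi]."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, n) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(chi2_percentage_point(0.95, 2))    # compare with the n = 2 row of Table 1
print(chi2_percentage_point(0.95, 4))    # compare with the n = 4 row
print(chi2_percentage_point(0.99, 10))   # compare with the n = 10 row
```

For n = 2 the result can also be written in closed form as −2 ln α, which offers a simple check of the bisection.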
Table 2: Extreme confidence levels for the chi-square distribution

Chi-square Confidence Levels (as χ² values)

d.f.    0.1   0.01  10^-3  10^-4  10^-5  10^-6  10^-7  10^-8  10^-9 10^-10 10^-11 10^-12
   1   2.71   6.63   10.8   15.1   19.5   23.9   28.4   32.8   37.3   41.8   46.3   50.8
   2   4.61   9.21   13.8   18.4   23.0   27.6   32.2   36.8   41.4   46.1   50.7   55.3
   3   6.25   11.3   16.3   21.1   25.9   30.7   35.4   40.1   44.8   49.5   54.2   58.9
   4   7.78   13.3   18.5   23.5   28.5   33.4   38.2   43.1   47.9   52.7   57.4   62.2
   5   9.24   15.1   20.5   25.7   30.9   35.9   40.9   45.8   50.7   55.6   60.4   65.2
   6   10.6   16.8   22.5   27.9   33.1   38.3   43.3   48.4   53.3   58.3   63.2   68.1
   7   12.0   18.5   24.3   29.9   35.3   40.5   45.7   50.8   55.9   60.9   65.9   70.8
   8   13.4   20.1   26.1   31.8   37.3   42.7   48.0   53.2   58.3   63.4   68.4   73.5
   9   14.7   21.7   27.9   33.7   39.3   44.8   50.2   55.4   60.7   65.8   70.9   76.0
  10   16.0   23.2   29.6   35.6   41.3   46.9   52.3   57.7   62.9   68.2   73.3   78.5
  11   17.3   24.7   31.3   37.4   43.2   48.9   54.4   59.8   65.2   70.5   75.7   80.9
  12   18.5   26.2   32.9   39.1   45.1   50.8   56.4   61.9   67.3   72.7   78.0   83.2
  13   19.8   27.7   34.5   40.9   46.9   52.7   58.4   64.0   69.5   74.9   80.2   85.5
  14   21.1   29.1   36.1   42.6   48.7   54.6   60.4   66.0   71.6   77.0   82.4   87.8
  15   22.3   30.6   37.7   44.3   50.5   56.5   62.3   68.0   73.6   79.1   84.6   90.0
  16   23.5   32.0   39.3   45.9   52.2   58.3   64.2   70.0   75.7   81.2   86.7   92.2
  17   24.8   33.4   40.8   47.6   54.0   60.1   66.1   71.9   77.6   83.3   88.8   94.3
  18   26.0   34.8   42.3   49.2   55.7   61.9   68.0   73.8   79.6   85.3   90.9   96.4
  19   27.2   36.2   43.8   50.8   57.4   63.7   69.8   75.7   81.6   87.3   92.9   98.5
  20   28.4   37.6   45.3   52.4   59.0   65.4   71.6   77.6   83.5   89.3   94.9    101
  25   34.4   44.3   52.6   60.1   67.2   73.9   80.4   86.6   92.8   98.8    105    111
  30   40.3   50.9   59.7   67.6   75.0   82.0   88.8   95.3    102    108    114    120
  35   46.1   57.3   66.6   74.9   82.6   89.9   97.0    104    110    117    123    129
  40   51.8   63.7   73.4   82.1   90.1   97.7    105    112    119    125    132    138
  45   57.5   70.0   80.1   89.1   97.4    105    113    120    127    134    140    147
  50   63.2   76.2   86.7   96.0    105    113    120    128    135    142    149    155
  60   74.4   88.4   99.6    110    119    127    135    143    150    158    165    172
  70   85.5    100    112    123    132    141    150    158    166    173    181    188
  80   96.6    112    125    136    146    155    164    172    180    188    196    204
  90    108    124    137    149    159    169    178    187    195    203    211    219
 100    118    136    149    161    172    182    192    201    209    218    226    234
 120    140    159    174    186    198    209    219    228    237    246    255    263
 150    173    193    209    223    236    247    258    268    278    288    297    306
 200    226    249    268    283    297    310    322    333    344    355    365    374
 300    332    360    381    400    416    431    445    458    471    483    495    506
 400    437    469    493    514    532    549    565    580    594    607    620    632
 500    541    576    603    626    646    665    682    698    714    728    742    756
 600    645    684    713    737    759    779    798    815    832    847    862    877
 800    852    896    929    957    982   1005   1026   1045   1064   1081   1098   1114
1000   1058   1107   1144   1175   1202   1227   1250   1272   1292   1311   1330   1348
Table 3: Extreme confidence levels for the chi-square distribution (as χ²/d.f. values)

Chi-square Confidence Levels (as χ²/d.f. values)

d.f.    0.1   0.01  10^-3  10^-4  10^-5  10^-6  10^-7  10^-8  10^-9 10^-10 10^-11 10^-12
   1   2.71   6.63  10.83  15.14  19.51  23.93  28.37  32.84  37.32  41.82  46.33  50.84
   2   2.30   4.61   6.91   9.21  11.51  13.82  16.12  18.42  20.72  23.03  25.33  27.63
   3   2.08   3.78   5.42   7.04   8.63  10.22  11.80  13.38  14.95  16.51  18.08  19.64
   4   1.94   3.32   4.62   5.88   7.12   8.34   9.56  10.77  11.97  13.17  14.36  15.55
   5   1.85   3.02   4.10   5.15   6.17   7.18   8.17   9.16  10.14  11.11  12.08  13.05
   6   1.77   2.80   3.74   4.64   5.52   6.38   7.22   8.06   8.89   9.72  10.54  11.35
   7   1.72   2.64   3.47   4.27   5.04   5.79   6.53   7.26   7.98   8.70   9.41  10.12
   8   1.67   2.51   3.27   3.98   4.67   5.34   6.00   6.65   7.29   7.92   8.56   9.18
   9   1.63   2.41   3.10   3.75   4.37   4.98   5.57   6.16   6.74   7.31   7.88   8.45
  10   1.60   2.32   2.96   3.56   4.13   4.69   5.23   5.77   6.29   6.82   7.33   7.85
  11   1.57   2.25   2.84   3.40   3.93   4.44   4.94   5.44   5.92   6.41   6.88   7.35
  12   1.55   2.18   2.74   3.26   3.76   4.24   4.70   5.16   5.61   6.06   6.50   6.93
  13   1.52   2.13   2.66   3.14   3.61   4.06   4.49   4.92   5.34   5.76   6.17   6.58
  14   1.50   2.08   2.58   3.04   3.48   3.90   4.31   4.72   5.11   5.50   5.89   6.27
  15   1.49   2.04   2.51   2.95   3.37   3.77   4.16   4.54   4.91   5.28   5.64   6.00
  16   1.47   2.00   2.45   2.87   3.27   3.65   4.01   4.37   4.73   5.08   5.42   5.76
  17   1.46   1.97   2.40   2.80   3.17   3.54   3.89   4.23   4.57   4.90   5.22   5.55
  18   1.44   1.93   2.35   2.73   3.09   3.44   3.78   4.10   4.42   4.74   5.05   5.36
  19   1.43   1.90   2.31   2.67   3.02   3.35   3.67   3.99   4.29   4.59   4.89   5.18
  20   1.42   1.88   2.27   2.62   2.95   3.27   3.58   3.88   4.17   4.46   4.75   5.03
  25   1.38   1.77   2.10   2.41   2.69   2.96   3.21   3.47   3.71   3.95   4.19   4.42
  30   1.34   1.70   1.99   2.25   2.50   2.73   2.96   3.18   3.39   3.60   3.80   4.00
  35   1.32   1.64   1.90   2.14   2.36   2.57   2.77   2.96   3.15   3.34   3.52   3.69
  40   1.30   1.59   1.84   2.05   2.25   2.44   2.62   2.80   2.97   3.13   3.29   3.45
  45   1.28   1.55   1.78   1.98   2.16   2.34   2.50   2.66   2.82   2.97   3.12   3.26
  50   1.26   1.52   1.73   1.92   2.09   2.25   2.41   2.55   2.70   2.84   2.97   3.11
  60   1.24   1.47   1.66   1.83   1.98   2.12   2.25   2.38   2.51   2.63   2.75   2.86
  70   1.22   1.43   1.60   1.75   1.89   2.02   2.14   2.25   2.37   2.48   2.58   2.68
  80   1.21   1.40   1.56   1.70   1.82   1.94   2.05   2.15   2.26   2.35   2.45   2.54
  90   1.20   1.38   1.52   1.65   1.77   1.87   1.98   2.07   2.17   2.26   2.35   2.43
 100   1.18   1.36   1.49   1.61   1.72   1.82   1.92   2.01   2.09   2.18   2.26   2.34
 120   1.17   1.32   1.45   1.55   1.65   1.74   1.82   1.90   1.98   2.05   2.12   2.19
 150   1.15   1.29   1.40   1.49   1.57   1.65   1.72   1.79   1.85   1.92   1.98   2.04
 200   1.13   1.25   1.34   1.42   1.48   1.55   1.61   1.67   1.72   1.77   1.82   1.87
 300   1.11   1.20   1.27   1.33   1.39   1.44   1.48   1.53   1.57   1.61   1.65   1.69
 400   1.09   1.17   1.23   1.28   1.33   1.37   1.41   1.45   1.48   1.52   1.55   1.58
 500   1.08   1.15   1.21   1.25   1.29   1.33   1.36   1.40   1.43   1.46   1.48   1.51
 600   1.07   1.14   1.19   1.23   1.27   1.30   1.33   1.36   1.39   1.41   1.44   1.46
 800   1.06   1.12   1.16   1.20   1.23   1.26   1.28   1.31   1.33   1.35   1.37   1.39
1000   1.06   1.11   1.14   1.17   1.20   1.23   1.25   1.27   1.29   1.31   1.33   1.35


Table 4: Exact and approximate values for the Bernoulli numbers

Bernoulli numbers B_n = N/D, with the approximate value given as B_n/10^k

 n        B_n/10^k    k   N/D
 0   1.00000 00000    0   1/1
 1  -5.00000 00000   -1   -1/2
 2   1.66666 66667   -1   1/6
 4  -3.33333 33333   -2   -1/30
 6   2.38095 23810   -2   1/42
 8  -3.33333 33333   -2   -1/30
10   7.57575 75758   -2   5/66
12  -2.53113 55311   -1   -691/2730
14   1.16666 66667    0   7/6
16  -7.09215 68627    0   -3617/510
18   5.49711 77945    1   43867/798
20  -5.29124 24242    2   -174611/330
22   6.19212 31884    3   854513/138
24  -8.65802 53114    4   -236364091/2730
26   1.42551 71667    6   8553103/6
28  -2.72982 31068    7   -23749461029/870
30   6.01580 87390    8   8615841276005/14322
32  -1.51163 15767   10   -7709321041217/510
34   4.29614 64306   11   2577687858367/6
36  -1.37116 55205   13   -26315271553053477373/1919190
38   4.88332 31897   14   2929993913841559/6
40  -1.92965 79342   16   -261082718496449122051/13530
42   8.41693 04757   17   1520097643918070802691/1806
44  -4.03380 71854   19   -27833269579301024235023/690
46   2.11507 48638   21   596451111593912163277961/282
48  -1.20866 26522   23   -5609403368997817686249127547/46410
50   7.50086 67461   24   495057205241079648212477525/66
52  -5.03877 81015   26   -801165718135489957347924991853/1590
54   3.65287 76485   28   29149963634884862421418123812691/798
56  -2.84987 69302   30   -2479392929313226753685415739663229/870
58   2.38654 27500   32   84483613348880041862046775994036021/354
60  -2.13999 49257   34   -1215233140483755572040304994079820246041491/56786730
62   2.05009 75723   36   12300585434086858541953039857403386151/6
64  -2.09380 05911   38   -106783830147866529886385444979142647942017/510
66   2.27526 96488   40   1472600022126335654051619428551932342241899101/64722
68  -2.62577 10286   42   -78773130858718728141909149208474606244347001/30
70   3.21250 82103   44   1505381347333367003803076567377857208511438160235/4686
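The exact fractions in Table 4 can be regenerated with rational arithmetic. The sketch below uses the standard recurrence sum_{k=0}^{m} C(m+1, k) B_k = 0 for m ≥ 1, which yields the convention B_1 = −1/2 used in the table; the function name is chosen for this illustration.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        # sum_{k=0}^{m} C(m+1, k) B_k = 0, solved for B_m
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / (m + 1))
    return B

B = bernoulli_numbers(20)
for n in (12, 16, 20):
    print(n, B[n], float(B[n]))
```

The floating-point values printed alongside the fractions correspond to the product of the B_n/10^k column and 10^k.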
Table 5: Percentage points of the F-distribution

α = 0.10
                           m
   n       5      10      20      50     100       ∞
   1   57.24   60.19   61.74   62.69   63.01   63.33
   2   9.293   9.392   9.441   9.471   9.481   9.491
   3   5.309   5.230   5.184   5.155   5.144   5.134
   4   4.051   3.920   3.844   3.795   3.778   3.761
   5   3.453   3.297   3.207   3.147   3.126   3.105
  10   2.522   2.323   2.201   2.117   2.087   2.055
  20   2.158   1.937   1.794   1.690   1.650   1.607
  50   1.966   1.729   1.568   1.441   1.388   1.327
 100   1.906   1.663   1.494   1.355   1.293   1.214
   ∞   1.847   1.599   1.421   1.263   1.185   1.000

α = 0.05
                           m
   n       5      10      20      50     100       ∞
   1   230.2   241.9   248.0   251.8   253.0   254.3
   2   19.30   19.40   19.45   19.48   19.49   19.50
   3   9.013   8.786   8.660   8.581   8.554   8.526
   4   6.256   5.964   5.803   5.699   5.664   5.628
   5   5.050   4.735   4.558   4.444   4.405   4.365
  10   3.326   2.978   2.774   2.637   2.588   2.538
  20   2.711   2.348   2.124   1.966   1.907   1.843
  50   2.400   2.026   1.784   1.599   1.525   1.438
 100   2.305   1.927   1.676   1.477   1.392   1.283
   ∞   2.214   1.831   1.571   1.350   1.243   1.000

α = 0.01
                                           m
   n       1       2       3       4       5      10      20      50     100       ∞
   1    4052    5000    5403    5625    5764    6056    6209    6303    6334    6366
   2   98.50   99.00   99.17   99.25   99.30   99.40   99.45   99.48   99.49   99.50
   3   34.12   30.82   29.46   28.71   28.24   27.23   26.69   26.35   26.24   26.13
   4   21.20   18.00   16.69   15.98   15.52   14.55   14.02   13.69   13.58   13.46
   5   16.26   13.27   12.06   11.39   10.97   10.05   9.553   9.238   9.130   9.020
  10   10.04   7.559   6.552   5.994   5.636   4.849   4.405   4.115   4.014   3.909
  20   8.096   5.849   4.938   4.431   4.103   3.368   2.938   2.643   2.535   2.421
  50   7.171   5.057   4.199   3.720   3.408   2.698   2.265   1.949   1.825   1.683
 100   6.895   4.824   3.984   3.513   3.206   2.503   2.067   1.735   1.598   1.427
   ∞   6.635   4.605   3.782   3.319   3.017   2.321   1.878   1.523   1.358   1.000



Table 6: Probability content from −z to z of Gauss distribution in %

   z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
 0.0    0.00   0.80   1.60   2.39   3.19   3.99   4.78   5.58   6.38   7.17
 0.1    7.97   8.76   9.55  10.34  11.13  11.92  12.71  13.50  14.28  15.07
 0.2   15.85  16.63  17.41  18.19  18.97  19.74  20.51  21.28  22.05  22.82
 0.3   23.58  24.34  25.10  25.86  26.61  27.37  28.12  28.86  29.61  30.35
 0.4   31.08  31.82  32.55  33.28  34.01  34.73  35.45  36.16  36.88  37.59
 0.5   38.29  38.99  39.69  40.39  41.08  41.77  42.45  43.13  43.81  44.48
 0.6   45.15  45.81  46.47  47.13  47.78  48.43  49.07  49.71  50.35  50.98
 0.7   51.61  52.23  52.85  53.46  54.07  54.67  55.27  55.87  56.46  57.05
 0.8   57.63  58.21  58.78  59.35  59.91  60.47  61.02  61.57  62.11  62.65
 0.9   63.19  63.72  64.24  64.76  65.28  65.79  66.29  66.80  67.29  67.78
 1.0   68.27  68.75  69.23  69.70  70.17  70.63  71.09  71.54  71.99  72.43
 1.1   72.87  73.30  73.73  74.15  74.57  74.99  75.40  75.80  76.20  76.60
 1.2   76.99  77.37  77.75  78.13  78.50  78.87  79.23  79.59  79.95  80.29
 1.3   80.64  80.98  81.32  81.65  81.98  82.30  82.62  82.93  83.24  83.55
 1.4   83.85  84.15  84.44  84.73  85.01  85.29  85.57  85.84  86.11  86.38
 1.5   86.64  86.90  87.15  87.40  87.64  87.89  88.12  88.36  88.59  88.82
 1.6   89.04  89.26  89.48  89.69  89.90  90.11  90.31  90.51  90.70  90.90
 1.7   91.09  91.27  91.46  91.64  91.81  91.99  92.16  92.33  92.49  92.65
 1.8   92.81  92.97  93.12  93.27  93.42  93.57  93.71  93.85  93.99  94.12
 1.9   94.26  94.39  94.51  94.64  94.76  94.88  95.00  95.12  95.23  95.34
 2.0   95.45  95.56  95.66  95.76  95.86  95.96  96.06  96.15  96.25  96.34
 2.1   96.43  96.51  96.60  96.68  96.76  96.84  96.92  97.00  97.07  97.15
 2.2   97.22  97.29  97.36  97.43  97.49  97.56  97.62  97.68  97.74  97.80
 2.3   97.86  97.91  97.97  98.02  98.07  98.12  98.17  98.22  98.27  98.32
 2.4   98.36  98.40  98.45  98.49  98.53  98.57  98.61  98.65  98.69  98.72
 2.5   98.76  98.79  98.83  98.86  98.89  98.92  98.95  98.98  99.01  99.04
 2.6   99.07  99.09  99.12  99.15  99.17  99.20  99.22  99.24  99.26  99.29
 2.7   99.31  99.33  99.35  99.37  99.39  99.40  99.42  99.44  99.46  99.47
 2.8   99.49  99.50  99.52  99.53  99.55  99.56  99.58  99.59  99.60  99.61
 2.9   99.63  99.64  99.65  99.66  99.67  99.68  99.69  99.70  99.71  99.72
 3.0   99.73  99.74  99.75  99.76  99.76  99.77  99.78  99.79  99.79  99.80
 3.1   99.81  99.81  99.82  99.83  99.83  99.84  99.84  99.85  99.85  99.86
 3.2   99.86  99.87  99.87  99.88  99.88  99.88  99.89  99.89  99.90  99.90
 3.3   99.90  99.91  99.91  99.91  99.92  99.92  99.92  99.92  99.93  99.93
 3.4   99.93  99.94  99.94  99.94  99.94  99.94  99.95  99.95  99.95  99.95
 3.5   99.95  99.96  99.96  99.96  99.96  99.96  99.96  99.96  99.97  99.97
 3.6   99.97  99.97  99.97  99.97  99.97  99.97  99.97  99.98  99.98  99.98
 3.7   99.98  99.98  99.98  99.98  99.98  99.98  99.98  99.98  99.98  99.98
 3.8   99.99  99.99  99.99  99.99  99.99  99.99  99.99  99.99  99.99  99.99
 3.9   99.99  99.99  99.99  99.99  99.99  99.99  99.99  99.99  99.99  99.99
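The probability content from −z to z of the standard normal distribution equals erf(z/√2), so entries of Table 6 can be recomputed with the error function of the Python standard library. The function name below is chosen for illustration.

```python
import math

def gauss_content_percent(z):
    """Probability content in % between -z and z for the standard normal distribution."""
    return 100.0 * math.erf(z / math.sqrt(2.0))

for z in (0.5, 1.0, 1.96, 3.0):
    print(z, round(gauss_content_percent(z), 2))
```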
Table 7: Standard normal distribution z-values for a specific probability content from −z
to z. Read column-wise and add marginal column and row figures to find probabilities.

 Prob.    0.00   0.10   0.20   0.30   0.40   0.50   0.60   0.70   0.80   0.90
 0.000   0.000  0.125  0.253  0.385  0.524  0.674  0.841  1.036  1.282  1.645
 0.002   0.002  0.128  0.256  0.388  0.527  0.677  0.845  1.041  1.287  1.655
 0.004   0.005  0.130  0.258  0.390  0.530  0.681  0.849  1.045  1.293  1.665
 0.006   0.007  0.133  0.261  0.393  0.533  0.684  0.852  1.049  1.299  1.675
 0.008   0.010  0.135  0.263  0.396  0.536  0.687  0.856  1.054  1.305  1.685
 0.010   0.012  0.138  0.266  0.398  0.538  0.690  0.859  1.058  1.311  1.696
 0.012   0.015  0.141  0.268  0.401  0.541  0.693  0.863  1.063  1.317  1.706
 0.014   0.017  0.143  0.271  0.404  0.544  0.696  0.867  1.067  1.323  1.717
 0.016   0.020  0.146  0.274  0.407  0.547  0.700  0.870  1.071  1.329  1.728
 0.018   0.022  0.148  0.276  0.409  0.550  0.703  0.874  1.076  1.335  1.740
 0.020   0.025  0.151  0.279  0.412  0.553  0.706  0.878  1.080  1.341  1.751
 0.022   0.027  0.153  0.281  0.415  0.556  0.709  0.881  1.085  1.347  1.763
 0.024   0.030  0.156  0.284  0.417  0.559  0.712  0.885  1.089  1.353  1.775
 0.026   0.033  0.158  0.287  0.420  0.562  0.716  0.889  1.094  1.360  1.787
 0.028   0.035  0.161  0.289  0.423  0.565  0.719  0.893  1.099  1.366  1.800
 0.030   0.038  0.163  0.292  0.426  0.568  0.722  0.896  1.103  1.372  1.812
 0.032   0.040  0.166  0.295  0.428  0.571  0.725  0.900  1.108  1.379  1.825
 0.034   0.043  0.168  0.297  0.431  0.574  0.729  0.904  1.112  1.385  1.839
 0.036   0.045  0.171  0.300  0.434  0.577  0.732  0.908  1.117  1.392  1.853
 0.038   0.048  0.173  0.302  0.437  0.580  0.735  0.911  1.122  1.399  1.867
 0.040   0.050  0.176  0.305  0.439  0.582  0.739  0.915  1.126  1.405  1.881
 0.042   0.053  0.179  0.308  0.442  0.585  0.742  0.919  1.131  1.412  1.896
 0.044   0.055  0.181  0.310  0.445  0.588  0.745  0.923  1.136  1.419  1.911
 0.046   0.058  0.184  0.313  0.448  0.591  0.749  0.927  1.141  1.426  1.927
 0.048   0.060  0.186  0.316  0.451  0.594  0.752  0.931  1.146  1.433  1.944
 0.050   0.063  0.189  0.318  0.453  0.597  0.755  0.935  1.150  1.440  1.960
 0.052   0.065  0.191  0.321  0.456  0.600  0.759  0.938  1.155  1.447  1.978
 0.054   0.068  0.194  0.323  0.459  0.603  0.762  0.942  1.160  1.454  1.996
 0.056   0.070  0.196  0.326  0.462  0.606  0.765  0.946  1.165  1.461  2.015
 0.058   0.073  0.199  0.329  0.464  0.609  0.769  0.950  1.170  1.469  2.034
 0.060   0.075  0.202  0.331  0.467  0.612  0.772  0.954  1.175  1.476  2.054
 0.062   0.078  0.204  0.334  0.470  0.615  0.775  0.958  1.180  1.484  2.075
 0.064   0.080  0.207  0.337  0.473  0.619  0.779  0.962  1.185  1.491  2.097
 0.066   0.083  0.209  0.339  0.476  0.622  0.782  0.966  1.190  1.499  2.121
 0.068   0.085  0.212  0.342  0.478  0.625  0.786  0.970  1.195  1.507  2.145
 0.070   0.088  0.214  0.345  0.481  0.628  0.789  0.974  1.200  1.514  2.171
 0.072   0.090  0.217  0.347  0.484  0.631  0.792  0.978  1.206  1.522  2.198
 0.074   0.093  0.219  0.350  0.487  0.634  0.796  0.982  1.211  1.530  2.227
 0.076   0.095  0.222  0.353  0.490  0.637  0.799  0.986  1.216  1.539  2.258
 0.078   0.098  0.225  0.355  0.493  0.640  0.803  0.990  1.221  1.547  2.291
 0.080   0.100  0.227  0.358  0.495  0.643  0.806  0.994  1.227  1.555  2.327
 0.082   0.103  0.230  0.361  0.498  0.646  0.810  0.999  1.232  1.564  2.366
 0.084   0.105  0.232  0.363  0.501  0.649  0.813  1.003  1.237  1.572  2.409
 0.086   0.108  0.235  0.366  0.504  0.652  0.817  1.007  1.243  1.581  2.458
 0.088   0.110  0.237  0.369  0.507  0.655  0.820  1.011  1.248  1.590  2.513
 0.090   0.113  0.240  0.371  0.510  0.659  0.824  1.015  1.254  1.599  2.576
 0.092   0.115  0.243  0.374  0.513  0.662  0.827  1.019  1.259  1.608  2.652
 0.094   0.118  0.245  0.377  0.515  0.665  0.831  1.024  1.265  1.617  2.748
 0.096   0.120  0.248  0.379  0.518  0.668  0.834  1.028  1.270  1.626  2.879
 0.098   0.123  0.250  0.382  0.521  0.671  0.838  1.032  1.276  1.636  3.091
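Table 7 is the inverse of Table 6: for a given probability content it gives the z-value for which the integral from −z to z equals that probability. A minimal way to reproduce such values is bisection on erf(z/√2), as sketched below; the function name and the search interval [0, 10] are assumptions of this example.

```python
import math

def z_for_content(prob, iters=100):
    """Find z such that the standard normal content from -z to z equals prob."""
    lo, hi = 0.0, 10.0            # erf(10/sqrt(2)) equals 1 to double precision
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.erf(mid / math.sqrt(2.0)) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# 0.90 + 0.050 = 0.95: compare with row 0.050, column 0.90 of Table 7
print(z_for_content(0.95))
print(z_for_content(0.90))
print(z_for_content(0.50))
```

Probabilities very close to one are limited by the floating-point resolution of erf, where a formulation based on erfc would be more accurate.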
Table 8: Percentage points of the t-distribution

                                      1 - α
   n    0.60   0.70   0.80   0.90   0.95  0.975  0.990  0.995  0.999 0.9995
   1   0.325  0.727  1.376  3.078  6.314  12.71  31.82  63.66  318.3  636.6
   2   0.289  0.617  1.061  1.886  2.920  4.303  6.965  9.925  22.33  31.60
   3   0.277  0.584  0.978  1.638  2.353  3.182  4.541  5.841  10.21  12.92
   4   0.271  0.569  0.941  1.533  2.132  2.776  3.747  4.604  7.173  8.610
   5   0.267  0.559  0.920  1.476  2.015  2.571  3.365  4.032  5.893  6.869
   6   0.265  0.553  0.906  1.440  1.943  2.447  3.143  3.707  5.208  5.959
   7   0.263  0.549  0.896  1.415  1.895  2.365  2.998  3.499  4.785  5.408
   8   0.262  0.546  0.889  1.397  1.860  2.306  2.896  3.355  4.501  5.041
   9   0.261  0.543  0.883  1.383  1.833  2.262  2.821  3.250  4.297  4.781
  10   0.260  0.542  0.879  1.372  1.812  2.228  2.764  3.169  4.144  4.587
  11   0.260  0.540  0.876  1.363  1.796  2.201  2.718  3.106  4.025  4.437
  12   0.259  0.539  0.873  1.356  1.782  2.179  2.681  3.055  3.930  4.318
  13   0.259  0.538  0.870  1.350  1.771  2.160  2.650  3.012  3.852  4.221
  14   0.258  0.537  0.868  1.345  1.761  2.145  2.624  2.977  3.787  4.140
  15   0.258  0.536  0.866  1.341  1.753  2.131  2.602  2.947  3.733  4.073
  16   0.258  0.535  0.865  1.337  1.746  2.120  2.583  2.921  3.686  4.015
  17   0.257  0.534  0.863  1.333  1.740  2.110  2.567  2.898  3.646  3.965
  18   0.257  0.534  0.862  1.330  1.734  2.101  2.552  2.878  3.610  3.922
  19   0.257  0.533  0.861  1.328  1.729  2.093  2.539  2.861  3.579  3.883
  20   0.257  0.533  0.860  1.325  1.725  2.086  2.528  2.845  3.552  3.850
  21   0.257  0.532  0.859  1.323  1.721  2.080  2.518  2.831  3.527  3.819
  22   0.256  0.532  0.858  1.321  1.717  2.074  2.508  2.819  3.505  3.792
  23   0.256  0.532  0.858  1.319  1.714  2.069  2.500  2.807  3.485  3.768
  24   0.256  0.531  0.857  1.318  1.711  2.064  2.492  2.797  3.467  3.745
  25   0.256  0.531  0.856  1.316  1.708  2.060  2.485  2.787  3.450  3.725
  26   0.256  0.531  0.856  1.315  1.706  2.056  2.479  2.779  3.435  3.707
  27   0.256  0.531  0.855  1.314  1.703  2.052  2.473  2.771  3.421  3.690
  28   0.256  0.530  0.855  1.313  1.701  2.048  2.467  2.763  3.408  3.674
  29   0.256  0.530  0.854  1.311  1.699  2.045  2.462  2.756  3.396  3.659
  30   0.256  0.530  0.854  1.310  1.697  2.042  2.457  2.750  3.385  3.646
  40   0.255  0.529  0.851  1.303  1.684  2.021  2.423  2.704  3.307  3.551
  50   0.255  0.528  0.849  1.299  1.676  2.009  2.403  2.678  3.261  3.496
  60   0.254  0.527  0.848  1.296  1.671  2.000  2.390  2.660  3.232  3.460
  70   0.254  0.527  0.847  1.294  1.667  1.994  2.381  2.648  3.211  3.435
  80   0.254  0.526  0.846  1.292  1.664  1.990  2.374  2.639  3.195  3.416
  90   0.254  0.526  0.846  1.291  1.662  1.987  2.368  2.632  3.183  3.402
 100   0.254  0.526  0.845  1.290  1.660  1.984  2.364  2.626  3.174  3.390
 110   0.254  0.526  0.845  1.289  1.659  1.982  2.361  2.621  3.166  3.381

120 

0.254 

0.526 

0.845 

1.289 

1.658 

1.980 

2.358 

2.617 

3.160 

3.373 

oo 

0.253 

0.524 

0.842 

1.282 

1.645 

1.960 

2.326 

2.576 

3.090 

3.291 
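The entries of table 8 can be spot-checked numerically. The following is a minimal Python sketch, not the method used to produce the table: it integrates the t density with Simpson's rule and inverts the cumulative distribution by bisection. The function names (`t_pdf`, `t_cdf`, `t_percentage_point`) and the fixed number of integration steps are assumptions made for illustration. The fixed step count is adequate for moderate percentage points only, not for the far tails at small n.

```python
import math

def t_pdf(t, n):
    """Density of Student's t-distribution with n degrees of freedom."""
    log_norm = (math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
                - 0.5 * math.log(n * math.pi))
    return math.exp(log_norm - (n + 1) / 2 * math.log1p(t * t / n))

def t_cdf(t, n, steps=2000):
    """P(T <= t) for t >= 0: 1/2 plus a Simpson's-rule integral over [0, t]."""
    if t == 0.0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, n) + t_pdf(t, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, n)
    return 0.5 + total * h / 3

def t_percentage_point(p, n):
    """Return t with P(T <= t) = p for 0.5 <= p < 1, found by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, n) < p:  # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `t_percentage_point(0.975, 10)` should agree with the tabulated 2.228 to within the rounding of the table.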
Table 9: Expressions for the Beta function B(m, n) for integer and half-integer arguments

m \ n  1/2           1             3/2           2             5/2           3             7/2           4             9/2           5
1/2    π
1      2             1
3/2    π/2           2/3           π/8
2      4/3           1/2           4/15          1/6
5/2    3π/8          2/5           π/16          4/35          3π/128
3      16/15         1/3           16/105        1/12          16/315        1/30
7/2    5π/16         2/7           5π/128        4/63          3π/256        16/693        5π/1024
4      32/35         1/4           32/315        1/20          32/1155       1/60          32/3003       1/140
9/2    35π/128       2/9           7π/256        4/99          7π/1024       16/1287       5π/2048       32/6435       35π/32768
5      256/315       1/5           256/3465      1/30          256/15015     1/105         256/45045     1/280         256/109395    1/630
11/2   63π/256       2/11          21π/1024      4/143         9π/2048       16/2145       45π/32768     32/12155      35π/65536     256/230945
6      512/693       1/6           512/9009      1/42          512/45045     1/168         512/153153    1/504         512/415701    1/1260
13/2   231π/1024     2/13          33π/2048      4/195         99π/32768     16/3315       55π/65536     32/20995      77π/262144    256/440895
7      2048/3003     1/7           2048/45045    1/56          2048/255255   1/252         2048/969969   1/840         2048/2909907  1/2310
15/2   429π/2048     2/15          429π/32768    4/255         143π/65536    16/4845       143π/262144   32/33915      91π/524288    256/780045

m \ n  11/2            6               13/2            7               15/2
11/2   63π/262144
6      512/969969      1/2772
13/2   63π/524288      512/2028117     231π/4194304
7      2048/7436429    1/5544          2048/16900975   1/12012
15/2   273π/4194304    512/3900225     231π/8388608    2048/35102025   429π/33554432
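The entries of table 9 follow from B(m, n) = Γ(m)Γ(n)/Γ(m + n) together with the closed forms of the Gamma function for integer and half-integer arguments. The sketch below reproduces them in exact rational arithmetic. It is only an illustration: the helper names `gamma_parts` and `beta_exact`, and the convention of passing twice the argument as an integer, are assumptions and not part of the hand-book.

```python
from fractions import Fraction
import math

def gamma_parts(twice_x):
    """Return (rational, half) with Gamma(x) = rational * sqrt(pi)**half, x = twice_x / 2."""
    if twice_x <= 0:
        raise ValueError("argument must be positive")
    if twice_x % 2 == 0:
        # integer argument: Gamma(k) = (k - 1)!
        return Fraction(math.factorial(twice_x // 2 - 1)), 0
    # half-integer argument: Gamma(n + 1/2) = (2n)! / (4^n n!) * sqrt(pi)
    n = (twice_x - 1) // 2
    return Fraction(math.factorial(2 * n), 4 ** n * math.factorial(n)), 1

def beta_exact(twice_m, twice_n):
    """Return (coefficient, pi_power) such that B(m, n) = coefficient * pi**pi_power."""
    gm, hm = gamma_parts(twice_m)
    gn, hn = gamma_parts(twice_n)
    gs, hs = gamma_parts(twice_m + twice_n)
    # two sqrt(pi) factors survive only when both arguments are half-integers
    pi_power = (hm + hn - hs) // 2
    return gm * gn / gs, pi_power
```

For example, `beta_exact(5, 3)` returns `(Fraction(1, 16), 1)`, i.e. B(5/2, 3/2) = π/16 as in the table.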
Mathematical Constants


Introduction

It is handy to have available the values of different mathematical constants appearing in
many expressions in statistical calculations. In this section we list, with high precision,
many of those which may be needed. In some cases we give, after the tables, basic expressions
which may be nice to recall. Note, however, that these are not full explanations; consult
the main text or other sources for details.


Some Basic Constants

              approx.
π             3.14159 26535 89793 23846
e             2.71828 18284 59045 23536
γ             0.57721 56649 01532 86061
√π            1.77245 38509 05516 02730
1/√(2π)       0.39894 22804 01432 67794

$$e = \sum_{n=0}^{\infty} \frac{1}{n!} = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n$$

$$\gamma = \lim_{n\to\infty} \left(\sum_{k=1}^{n} \frac{1}{k} - \ln n\right)$$


Gamma Function

              exact           approx.
Γ(1/2)        √π              1.77245 38509 05516 02730
Γ(3/2)        (1/2)√π         0.88622 69254 52758 01365
Γ(5/2)        (3/4)√π         1.32934 03881 79137 02047
Γ(7/2)        (15/8)√π        3.32335 09704 47842 55118
Γ(9/2)        (105/16)√π      11.63172 83965 67448 92914

$$\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt$$

$$n! = \Gamma(n+1) = n\Gamma(n)$$

$$\Gamma(n+\tfrac{1}{2}) = \frac{(2n-1)!!}{2^n}\,\Gamma(\tfrac{1}{2}) = \frac{(2n)!}{2^{2n} n!}\sqrt{\pi}$$

See further section 42.2 on page 154 and reference [42] for more details.


Beta Function

For exact expressions for the Beta function for half-integer and integer values see table 9
on page 179.

$$B(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)} = \int_0^1 x^{a-1}(1-x)^{b-1}\,dx = \int_0^{\infty} \frac{x^{a-1}}{(1+x)^{a+b}}\,dx$$

See further section 42.6 on page 160.


Digamma Function

              exact                         approx.
ψ(1/2)        -γ - 2 ln 2                   -1.96351 00260 21423 47944
ψ(3/2)        ψ(1/2) + 2                    0.03648 99739 78576 52056
ψ(5/2)        ψ(1/2) + 8/3                  0.70315 66406 45243 18723
ψ(7/2)        ψ(1/2) + 46/15                1.10315 66406 45243 18723
ψ(9/2)        ψ(1/2) + 352/105              1.38887 09263 59528 90151
ψ(11/2)       ψ(1/2) + 1126/315             1.61109 31485 81751 12373
ψ(13/2)       ψ(1/2) + 13016/3465           1.79291 13303 99932 94192
ψ(15/2)       ψ(1/2) + 176138/45045         1.94675 74842 46086 78807
ψ(17/2)       ψ(1/2) + 182144/45045         2.08009 08175 94201 21402
ψ(19/2)       ψ(1/2) + 3186538/765765       2.19773 78764 02949 53317
ψ(1)          -γ                            -0.57721 56649 01532 86061
ψ(2)          1 - γ                         0.42278 43350 98467 13939
ψ(3)          3/2 - γ                       0.92278 43350 98467 13939
ψ(4)          11/6 - γ                      1.25611 76684 31800 47273
ψ(5)          25/12 - γ                     1.50611 76684 31800 47273
ψ(6)          137/60 - γ                    1.70611 76684 31800 47273
ψ(7)          49/20 - γ                     1.87278 43350 98467 13939
ψ(8)          363/140 - γ                   2.01564 14779 55609 99654
ψ(9)          761/280 - γ                   2.14064 14779 55609 99654
ψ(10)         7129/2520 - γ                 2.25175 25890 66721 10765

$$\psi(z) = \frac{d}{dz}\ln\Gamma(z) = \frac{1}{\Gamma(z)}\frac{d\Gamma(z)}{dz}$$

$$\psi(z+1) = \psi(z) + \frac{1}{z}$$

$$\psi(n) = -\gamma + \sum_{m=1}^{n-1} \frac{1}{m}$$

$$\psi(n+\tfrac{1}{2}) = -\gamma - 2\ln 2 + 2\sum_{m=1}^{n} \frac{1}{2m-1}$$

See further section 42.3 on page 156.
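The exact expressions in the digamma table follow directly from the two sums for ψ(n) and ψ(n + 1/2). A minimal Python sketch, under the assumption that double precision is sufficient for a spot check, is given here. The names `EULER_GAMMA`, `digamma_integer` and `digamma_half_integer` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
import math

# Euler's constant, rounded to double precision from the basic constants listed above
EULER_GAMMA = 0.5772156649015329

def digamma_integer(n):
    """Return (rational, value) with psi(n) = rational - gamma for integer n >= 1."""
    harmonic = sum((Fraction(1, m) for m in range(1, n)), Fraction(0))
    return harmonic, float(harmonic) - EULER_GAMMA

def digamma_half_integer(n):
    """Return (rational, value) with psi(n + 1/2) = psi(1/2) + rational, n >= 0."""
    offset = 2 * sum((Fraction(1, 2 * m - 1) for m in range(1, n + 1)), Fraction(0))
    psi_half = -EULER_GAMMA - 2.0 * math.log(2.0)  # psi(1/2) = -gamma - 2 ln 2
    return offset, psi_half + float(offset)
```

Calling `digamma_integer(10)` gives the rational part 7129/2520 listed for ψ(10), and `digamma_half_integer(4)` gives the offset 352/105 listed for ψ(9/2).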
Polygamma Function

See also page 59 and for details reference [31].

                exact         approx.
ψ^(1)(1/2)      π^2/2         4.93480 22005 44679 30942
ψ^(2)(1/2)      -14ζ_3        -16.82879 66442 34319 99560
ψ^(3)(1/2)      π^4           97.40909 10340 02437 23644
ψ^(4)(1/2)      -744ζ_5       -771.47424 98266 67225 19054
ψ^(5)(1/2)      8π^6          7691.11354 86024 35496 24176
ψ^(1)(1)        ζ_2           1.64493 40668 48226 43647
ψ^(2)(1)        -2ζ_3         -2.40411 38063 19188 57080
ψ^(3)(1)        6ζ_4          6.49393 94022 66829 14910
ψ^(4)(1)        -24ζ_5        -24.88626 61234 40878 23195
ψ^(5)(1)        120ζ_6        122.08116 74381 33896 76574

$$\psi^{(n)}(z) = \frac{d^n}{dz^n}\psi(z) = \frac{d^{n+1}}{dz^{n+1}}\ln\Gamma(z) = (-1)^{n+1}\, n! \sum_{k=0}^{\infty} \frac{1}{(z+k)^{n+1}}$$

$$\psi^{(n)}(1) = (-1)^{n+1}\, n!\, \zeta_{n+1}$$

$$\psi^{(n)}(\tfrac{1}{2}) = (2^{n+1} - 1)\, \psi^{(n)}(1)$$

$$\psi^{(m)}(n+1) = (-1)^m\, m! \left[ -\zeta_{m+1} + 1 + \frac{1}{2^{m+1}} + \ldots + \frac{1}{n^{m+1}} \right] = \psi^{(m)}(n) + (-1)^m\, m!\, \frac{1}{n^{m+1}}$$

See further section 42.4 on page 158.
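The series representation of the polygamma function lends itself to a direct numerical check of the tabulated values and of the relation between the values at 1/2 and at 1. The sketch below sums the first terms explicitly and estimates the remainder with the leading Euler-Maclaurin terms. That tail estimate is a technique swapped in for this illustration, and the function name `polygamma` and the default number of terms are assumptions, not taken from the hand-book.

```python
import math

def polygamma(n, z, terms=1000):
    """psi^(n)(z) for integer n >= 1 and z > 0 from the series representation."""
    s = n + 1
    # explicit part: sum over k = 0 .. terms-1 of 1 / (z + k)^(n+1)
    head = sum((z + k) ** -s for k in range(terms))
    # remainder from k = terms onwards: integral + f/2 - f'/12 (Euler-Maclaurin)
    a = z + terms
    tail = a ** (1 - s) / (s - 1) + 0.5 * a ** -s + s * a ** (-s - 1) / 12.0
    return (-1) ** (n + 1) * math.factorial(n) * (head + tail)
```

With these definitions `polygamma(1, 0.5)` should reproduce π²/2 and `polygamma(4, 0.5)` the negative value -744ζ₅ to roughly double precision.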


Bernoulli Numbers

See table 4 on page 174.


Sum of Powers

In many calculations involving discrete distributions sums of powers are needed. A general
formula for this is given by

$$\sum_{k=1}^{n} k^i = \sum_{j=0}^{i} (-1)^j \binom{i}{j} B_j \frac{n^{i-j+1}}{i-j+1}$$

where $B_j$ denotes the Bernoulli numbers (see page 174). More specifically

$$\begin{aligned}
\sum_{k=1}^{n} k   &= n(n+1)/2 \\
\sum_{k=1}^{n} k^2 &= n(n+1)(2n+1)/6 \\
\sum_{k=1}^{n} k^3 &= n^2(n+1)^2/4 = \Big(\sum_{k=1}^{n} k\Big)^2 \\
\sum_{k=1}^{n} k^4 &= n(n+1)(2n+1)(3n^2+3n-1)/30 \\
\sum_{k=1}^{n} k^5 &= n^2(n+1)^2(2n^2+2n-1)/12 \\
\sum_{k=1}^{n} k^6 &= n(n+1)(2n+1)(3n^4+6n^3-3n+1)/42
\end{aligned}$$
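The general formula for sums of powers can be verified against brute-force summation with exact rational arithmetic. The following Python sketch assumes the convention B_1 = -1/2 for the Bernoulli numbers and generates them from the standard recurrence. The function names `bernoulli_numbers` and `sum_of_powers` are hypothetical and serve only this illustration.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with the convention B_1 = -1/2."""
    b = []
    for m in range(count):
        if m == 0:
            b.append(Fraction(1))
        else:
            # recurrence: sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1
            acc = sum(comb(m + 1, j) * b[j] for j in range(m))
            b.append(-acc / (m + 1))
    return b

def sum_of_powers(n, i):
    """Sum of k**i for k = 1..n evaluated from the Bernoulli-number formula."""
    b = bernoulli_numbers(i + 1)
    return sum((-1) ** j * comb(i, j) * b[j] * Fraction(n ** (i - j + 1), i - j + 1)
               for j in range(i + 1))
```

Because everything is kept as `Fraction`, a comparison such as `sum_of_powers(10, 6) == sum(k**6 for k in range(1, 11))` is exact rather than approximate.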

Riemann's Zeta-function

          exact           approx.
ζ_0
ζ_1       ∞
ζ_2       π^2/6           1.64493 40668 48226 43647
ζ_3                       1.20205 69031 59594 28540
ζ_4       π^4/90          1.08232 32337 11138 19152
ζ_5                       1.03692 77551 43369 92633
ζ_6       π^6/945         1.01734 30619 84449 13971
ζ_7                       1.00834 92773 81922 82684
ζ_8       π^8/9450        1.00407 73561 97944 33938
ζ_9                       1.00200 83928 26082 21442
ζ_10      π^10/93555      1.00099 45751 27818 08534

$$\zeta_n = \sum_{k=1}^{\infty} \frac{1}{k^n}$$

$$\zeta_{2n} = \frac{2^{2n-1} \pi^{2n} |B_{2n}|}{(2n)!}$$
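The closed form for the zeta function at even arguments can be compared with a direct evaluation of the defining sum. The sketch below is an illustration under stated assumptions: the list `ABS_BERNOULLI_EVEN` holds the standard values |B_2| ... |B_10| typed in by hand, the names `zeta_even` and `zeta_sum` are hypothetical, and the remainder of the sum is estimated with the leading Euler-Maclaurin terms.

```python
import math
from fractions import Fraction

# |B_2|, |B_4|, |B_6|, |B_8|, |B_10| (standard values of the Bernoulli numbers)
ABS_BERNOULLI_EVEN = [Fraction(1, 6), Fraction(1, 30), Fraction(1, 42),
                      Fraction(1, 30), Fraction(5, 66)]

def zeta_even(n):
    """Return (coefficient, value) with zeta_{2n} = coefficient * pi^(2n), n = 1..5."""
    coeff = Fraction(2 ** (2 * n - 1)) * ABS_BERNOULLI_EVEN[n - 1] / math.factorial(2 * n)
    return coeff, float(coeff) * math.pi ** (2 * n)

def zeta_sum(s, terms=1000):
    """zeta_s for integer s > 1: explicit sum plus an Euler-Maclaurin tail estimate."""
    head = sum(k ** -s for k in range(1, terms))
    a = float(terms)
    # remainder from k = terms onwards: integral + f/2 - f'/12
    return head + a ** (1 - s) / (s - 1) + 0.5 * a ** -s + s * a ** (-s - 1) / 12.0
```

Here `zeta_even(5)[0]` gives the coefficient 1/93555 of the tabulated exact expression, and `zeta_sum(3)` can be compared with the approximate value listed for ζ_3.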
ERRATA et ADDENDA


Errors in this report are corrected as they are found, but for those who have printed an
early version of the hand-book we list the errata here. These are thus already obsolete in
this copy. Minor errors in language etc. are not listed. Note, however, that a few additions
(subsections and tables) have been made to the original report (see below).


• The contents part now has roman page numbers, thus shifting the arabic page numbers
of the main text.

• A new section 6.2 on conditional probability density for binormal distribution has
been added after the first edition.

• Section 42.6, formula, line 2, $\nu$ changed into $\lambda$ giving

$$f(x; \mu, \lambda) = \frac{\lambda}{2}\, e^{-\lambda|x-\mu|}$$

• Section 10.3, formula 2, line 4 has been corrected

$$\phi_x(t) = E(e^{\imath tx}) = e^{\imath t\mu} E(e^{\imath t(x-\mu)}) = \ldots$$


• Section 14.4, formula, line 2 changed to

$$\phi(t) = E(e^{\imath tx}) = \frac{1}{\alpha} \int_0^{\infty} e^{(\imath t - \frac{1}{\alpha})x}\,dx = \frac{1}{1 - \imath t\alpha}$$

• Section 18.1, figure 14 was erroneous in early editions and should look as is now 
shown in figure 73. 

• Section 27.2, line 12: change uri to vpi. 

• Section 27.6 on significance levels for the multinomial distribution has been added 
after the first edition. 

• Section 27.7 on the case with equal group probabilities for a multinomial distribution 
has been added after the first edition. 

• A small paragraph added to section 28.1 introducing the multinormal distribution. 

• A new section 28.2 on conditional probability density for the multinormal distribution 
has been added after the first edition. 

• Section 36.4, first formula, line 5, should read:

$$P(r) = \sum_{k=0}^{r} \frac{\mu^k e^{-\mu}}{k!} = 1 - P(r+1, \mu)$$
• Section 36.4, second formula, line 9, should read:

$$P(r) = \sum_{k=0}^{r} \frac{\mu^k e^{-\mu}}{k!} = 1 - \int_0^{2\mu} f(x; \nu = 2r+2)\,dx$$

• and in the next line it should read $f(x; \nu = 2r+2)$.

• Section 42.5.2, formula 3, line 6, should read:

$$\Gamma(z) = \alpha^z \int_0^{\infty} t^{z-1} e^{-\alpha t}\,dt \quad \text{for } \mathrm{Re}(z) > 0,\ \mathrm{Re}(\alpha) > 0$$


• Section 42.6, line 6: a reference to table 9 has been added (cf below). 

• Table 9 on page 179, on the Beta function B(m, n) for integer and half-integer
arguments, has been added after the first version of the paper.


These were, mostly minor, changes up to the 18th of March 1998 in order of appearance.

In October 1998 the first somewhat larger revision was made: 

• Some text concerning the coefficient of kurtosis added in section 2.2.

• Figure 6 for the chi-square distribution corrected for a normalization error for the 
n = 10 curve. 

• Added figure 8 for the chi distribution on page 44. 

• Added section 11 for the doubly non-central F-distribution and section 12 for the
doubly non-central t-distribution.

• Added figure 12 for the F-distribution on page 61. 

• Added section 30 on the non-central Beta-distribution on page 108. 

• For the non-central chi-square distribution we have added figure 22 and subsections
31.4 and 31.6 for the cumulative distribution and random number generation, respectively.

• For the non-central F-distribution figure 23 has been added on page 113. Errors in
the formulae for $f(F'; m, n, \lambda)$ in the introduction and $Z_1$ in the section on
approximations have been corrected. Subsections 32.2 on moments, 32.3 for the cumulative
distribution, and 32.5 for random number generation have been added.

• For the non-central t-distribution figure 24 has been added on page 116, some text
altered in the first subsection, and an error corrected in the denominator of the
approximation formula in subsection 33.5. Subsections 33.2 on the derivation of the
distribution, 33.3 on its moments, 33.4 on the cumulative distribution, and 33.6 on
random number generation have been added.
• A new subsection 34.8.9 has been added on yet another method, using a ratio between
two uniform deviates, to achieve standard normal random numbers. With this change
three new references [38,39,40] were introduced.

• A comparison of the efficiency for different algorithms to obtain standard normal
random numbers has been introduced as subsection 34.8.10.

• Added a comment on factorial moments and cumulants for a Poisson distribution in 
section 36.2. 

• This list of “Errata et Addenda” for past versions of the hand-book has been added 
on page 182 and onwards. 

• Table 2 on page 172 and table 3 on page 173 for extreme significance levels of the 
chi-square distribution have been added thus shifting the numbers of several other 
tables. This also slightly affected the text in section 8.10. 

• The Bernoulli numbers used in section 15.4 now follow the same convention used e.g.
in section 42.3. This change also affected the formula for $\kappa_{2n}$ in section 23.4.
Table 4 on page 174 on Bernoulli numbers was introduced at the same time shifting the
numbers of several other tables.

• A list of some mathematical constants which are useful in statistical calculations has
been introduced on page 180.


Minor changes afterwards include: 

• Added a “proof” for the formula for algebraic moments of the log-normal distribution 
in section 24.2 and added a section for the cumulative distribution as section 24.3. 

• Added formula also for c < 0 for F(x) of a Generalized Gamma distribution in section 
18.2. 

• Corrected bug in first formula in section 6.6.

• Replaced table for multinormal confidence levels on page 100 with a more precise one 
based on an analytical formula. 

• New section on sums of powers on page 181. 

• The illustration for the log-normal distribution in Figure 16 in section 24 was wrong 
and has been replaced. 